BUStools / bustools

Tools for working with BUS files
https://bustools.github.io/
BSD 2-Clause "Simplified" License

errors running bustools predict #73

Open deannachurch opened 3 years ago

deannachurch commented 3 years ago

Hi- I'm very interested in trying bustools predict. I have bustools version 0.41.0 and am trying to test various workflows using the 10x v3 PBMC data. I generated files using

kb count -i GRCh38.rna.all.idx -g transcripts_to_gene.txt -x 10xv3 -o kb_output/ --filter bustools -t 4 $fastqlist

and got this file list:

drwxrwxr-x. 5 deanna.church deanna.church 336 Jul 27 14:35 ..
drwxrwxr-x. 4 deanna.church deanna.church 404 Jul 27 14:35 .
-rw-rw-r--. 1 deanna.church deanna.church 3549 Jul 27 14:35 kb_info.json
drwxrwxr-x. 2 deanna.church deanna.church 120 Jul 27 14:35 counts_filtered
-rw-rw-r--. 1 deanna.church deanna.church 1375793 Jul 27 14:35 output.filtered.bus
-rw-rw-r--. 1 deanna.church deanna.church 1785 Jul 27 14:35 filter_barcodes.txt
drwxrwxr-x. 2 deanna.church deanna.church 120 Jul 27 14:35 counts_unfiltered
-rw-rw-r--. 1 deanna.church deanna.church 7355729 Jul 27 14:35 output.unfiltered.bus
-rw-rw-r--. 1 deanna.church deanna.church 571 Jul 27 14:34 inspect.json
-rw-rw-r--. 1 deanna.church deanna.church 115512960 Jul 27 14:34 10x_version3_whitelist.txt
-rw-rw-r--. 1 deanna.church deanna.church 754 Jul 27 14:34 run_info.json
-rw-rw-r--. 1 deanna.church deanna.church 2259708 Jul 27 14:34 transcripts.txt
-rw-rw-r--. 1 deanna.church deanna.church 50931753 Jul 27 14:34 matrix.ec
-rw-rw-r--. 1 deanna.church deanna.church 580401905 Jul 27 14:34 output.bus

but when running bustools predict (from the parent of this directory), I get:

(kb) [deanna.church@cose011 10x_v3_pbmc]$ bustools predict -o kb_output/ -t 2 kb_output/counts_filtered/
Error: Matrix file missing: kb_output/counts_filtered/output.mtx
Error: Genes file missing: kb_output/counts_filtered/output.genes.txt
Error: Barcodes file missing: kb_output/counts_filtered/output.barcodes.txt
Error: CPU histograms file missing: kb_output/counts_filtered/output.hist.txt. Did you forget the --hist flag when running count?
Usage: bustools predict [options] count_output_dir

Is this a workflow issue or a predict issue (or just a user issue)? Thanks in advance!

deannachurch commented 3 years ago

I have troubleshot this a bit more. It seems bustools predict expects the file names to all begin with output, but the count command created files beginning with cells_x_genes. I can work around this for now, but it would be nice to fix at some point. Thanks!
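A quick pre-flight check can show which inputs predict would fail to find before it is run. The sketch below is a hypothetical helper (`check_predict_inputs` is not part of bustools); the four suffixes it tests are taken only from the error messages quoted above, so treat the list as an assumption for other bustools versions.

```shell
#!/bin/sh
# Hypothetical helper, not part of bustools: report which predict inputs are
# missing for a given file prefix. The suffix list is an assumption based on
# the "file missing" errors shown in this thread.
check_predict_inputs() {
    prefix=$1
    missing=0
    for suffix in mtx genes.txt barcodes.txt hist.txt; do
        if [ ! -e "${prefix}.${suffix}" ]; then
            echo "missing: ${prefix}.${suffix}" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every expected file exists
    [ "$missing" -eq 0 ]
}
```

For example, `check_predict_inputs kb_output/counts_filtered/output` would list each absent file on stderr and return a non-zero status if any is missing.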

johan-gson commented 3 years ago

Hi,

Predict is simply not supported by kb at this point. Count needs to be run with the --hist flag before running predict, but maybe you figured that out.

deannachurch commented 3 years ago

Yes, I did. I ran count like this:

bustools count -o kb_output/counts_filtered/cells_x_genes -g transcripts_to_gene.txt -e kb_output/matrix.ec -t kb_output/transcripts.txt --genecounts --hist kb_output/output.filtered.bus

and all of the output files begin with 'cells_x_genes.X.txt|mtx'.

When you run predict as

bustools predict -t 2 -o kb_output/ kb_output/counts_filtered/

it fails because it cannot find the files: it looks for files beginning with 'output.*.txt|mtx'.

The solution was to create symlinks with the output pattern:

lrwxrwxrwx. 1 deanna.church deanna.church 22 Jul 28 14:43 output.hist.txt -> cells_x_genes.hist.txt
lrwxrwxrwx. 1 deanna.church deanna.church 26 Jul 28 14:42 output.barcodes.txt -> cells_x_genes.barcodes.txt
lrwxrwxrwx. 1 deanna.church deanna.church 23 Jul 28 14:42 output.genes.txt -> cells_x_genes.genes.txt
lrwxrwxrwx. 1 deanna.church deanna.church 17 Jul 28 14:42 output.mtx -> cells_x_genes.mtx
-rw-rw-r--. 1 deanna.church deanna.church 2257 Jul 28 14:34 cells_x_genes.CUPerCell.txt
-rw-rw-r--. 1 deanna.church deanna.church 488472 Jul 28 14:34 cells_x_genes.cu.txt
-rw-rw-r--. 1 deanna.church deanna.church 77117 Jul 28 14:34 cells_x_genes.hist.txt
-rw-rw-r--. 1 deanna.church deanna.church 1785 Jul 28 14:34 cells_x_genes.barcodes.txt
-rw-rw-r--. 1 deanna.church deanna.church 296907 Jul 28 14:34 cells_x_genes.genes.txt
-rw-rw-r--. 1 deanna.church deanna.church 21043 Jul 28 14:34 cells_x_genes.mtx

So I can work around it, but it is less than ideal.
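The symlink workaround described above could be scripted roughly as follows. This is only a sketch: `link_as_output` is a hypothetical helper name, and the suffix list mirrors the four links in the listing rather than anything documented by bustools.

```shell
#!/bin/sh
# Hypothetical helper: create output.* symlinks next to existing <prefix>.*
# count files so that a tool looking for output.* can find them.
link_as_output() {
    dir=$1       # directory holding the count files
    prefix=$2    # existing file prefix, e.g. cells_x_genes
    for suffix in mtx genes.txt barcodes.txt hist.txt; do
        if [ -e "${dir}/${prefix}.${suffix}" ]; then
            # Relative link target, as in the listing; -f replaces a stale link
            ln -sf "${prefix}.${suffix}" "${dir}/output.${suffix}"
        else
            echo "skipping missing file: ${dir}/${prefix}.${suffix}" >&2
        fi
    done
}
```

Calling `link_as_output kb_output/counts_filtered cells_x_genes` would reproduce the four links shown in the listing, provided the count files exist.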

johan-gson commented 3 years ago

Hmm, try the following:

bustools predict -t 2 -o kb_output/ kb_output/counts_filtered/cells_x_genes

As I remember it, it should work.

deannachurch commented 3 years ago

This throws an error:

`bustools predict -t 2 -o kb_output/counts_filtered/cells_x_genes`

`terminate called after throwing an instance of 'std::out_of_range'`
`what(): basic_string::at: __n (which is 18446744073709551615) >= this->size() (which is 0)`
`Aborted`

(I substituted your path with the actual path)

johan-gson commented 3 years ago

The -o flag sets the output path (I set it to "kb_output/", though I am not sure that is the right location), while the argument after all the flags, which in my case was "kb_output/counts_filtered/cells_x_genes", specifies where to find the input. So your line above is incorrect; it should be similar to what I wrote. I think you mistook that for a single argument containing a space, but these are two different arguments, and only the first belongs to -o.
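The difference between the two invocations can be illustrated with a generic option parser. The sketch below uses the shell's `getopts` and a hypothetical `parse_args` function; it is not bustools' own argument handling, only a demonstration of how a flag that takes a value consumes the next token.

```shell
#!/bin/sh
# Illustration only: a generic getopts parser, NOT bustools' own code.
# -o and -t each consume the following token; whatever remains afterwards
# is treated as the positional input.
parse_args() {
    OPTIND=1
    out=""
    threads=""
    while getopts "o:t:" opt; do
        case $opt in
            o) out=$OPTARG ;;
            t) threads=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    echo "output=${out} threads=${threads} input=${1:-<none>}"
}

# The failing form: the path is swallowed by -o, so no input remains
parse_args -t 2 -o kb_output/counts_filtered/cells_x_genes
# The suggested form: -o takes kb_output/, and the prefix is the input
parse_args -t 2 -o kb_output/ kb_output/counts_filtered/cells_x_genes
```

The first call reports `input=<none>` and the second reports `input=kb_output/counts_filtered/cells_x_genes`. An empty input would be consistent with the out_of_range error on a zero-length string reported above, though that connection is an assumption rather than something confirmed in this thread.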

deannachurch commented 3 years ago

Bingo, I see the issue now. Thanks for your patience with my errors.

johan-gson commented 3 years ago

Nice, good luck with your project!