pachterlab / kallisto

Near-optimal RNA-Seq quantification
https://pachterlab.github.io/kallisto
BSD 2-Clause "Simplified" License

lr-kallisto not producing hdf5 files for bulk long-read RNA seq data #457

Closed: apsteinberg closed this issue 1 month ago

apsteinberg commented 2 months ago

Hello there,

I am very excited to start using kallisto with some of our bulk long-read RNA-seq data. For my analysis, I want to use kallisto to quantify read counts and then use sleuth for downstream statistical analysis. However, when I follow the workflow described here and pass the -b 100 flag to kallisto quant-tcc, no abundance.h5 file is produced. I ran this within a Snakemake pipeline and have attached the output log here: kallisto_logs.zip. The log seems to suggest that an HDF5 file with bootstrap replicates should be produced (see line 27 of the attached log file, and the excerpt below):

[index] k-mer length: 63
[index] number of targets: 237,664
[index] number of k-mers: 149,478,581
[index] number of equivalence classes loaded from file: 286,611
[tcc] Parsing transcript-compatibility counts (TCC) file as a matrix file
[tcc] Matrix dimensions: 1 x 286,611
[tcc] Bootstrapping will be performed and outputted as HDF5
[quant] Running EM algorithm...
[quant] Processing sample/cell 0
 done

These are the outputs I do see. Is it possible that I am confused and one of these is the HDF5 file I can use with sleuth?

-rw-rw-r-- 1 preskaa preskaa 347K Aug 28 10:12 matrix.abundance.gene.mtx
-rw-rw-r-- 1 preskaa preskaa 466K Aug 28 10:12 matrix.abundance.gene.tpm.mtx
-rw-rw-r-- 1 preskaa preskaa 1.3M Aug 28 10:12 matrix.abundance.mtx
-rw-rw-r-- 1 preskaa preskaa 1.5M Aug 28 10:12 matrix.abundance.tpm.mtx
-rw-rw-r-- 1 preskaa preskaa 883K Aug 28 10:12 matrix.efflens.mtx
-rw-rw-r-- 1 preskaa preskaa    6 Aug 28 10:12 matrix.fld.tsv
-rw-rw-r-- 1 preskaa preskaa 5.1M Aug 28 10:12 transcript_lengths.txt
-rw-rw-r-- 1 preskaa preskaa 4.1M Aug 28 10:11 transcripts.txt

Thanks in advance for your time and help.

Best, Asher
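
One quick way to settle whether any file in an output directory is an HDF5 file is to look for the HDF5 format signature (the 8 bytes `\x89HDF\r\n\x1a\n`) at the start of the file. The sketch below is only an illustration: it checks offset 0 only, which is a simplification, and the two files it inspects are stand-ins written to a temporary directory, not real kallisto output. The `.mtx` files in the listing above are Matrix Market text files, so a check like this would be expected to report them as not HDF5.

```python
import tempfile
from pathlib import Path

# The HDF5 format signature: 8 fixed bytes at the start of the file.
# (HDF5 also allows the signature at offsets 512, 1024, ... when a user
# block is present; this sketch only checks offset 0.)
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature."""
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def classify_outputs(out_dir):
    """Map each regular file name in `out_dir` to True (HDF5) or False."""
    return {
        path.name: looks_like_hdf5(path)
        for path in sorted(Path(out_dir).iterdir())
        if path.is_file()
    }


# Demo on stand-in files (hypothetical contents, NOT real kallisto output).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "matrix.abundance.mtx").write_text(
        "%%MatrixMarket matrix coordinate real general\n"
    )
    (Path(tmp) / "abundance.h5").write_bytes(HDF5_SIGNATURE + b"\x00" * 8)
    results = classify_outputs(tmp)

for name, is_hdf5 in results.items():
    print(f"{name}: {'HDF5' if is_hdf5 else 'not HDF5'}")
```

To use it on a real run, call `classify_outputs` with the quant-tcc output directory instead of the temporary one.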

bound-to-love commented 1 month ago

Hi, Asher, are you including --matrix-to-files in your quant-tcc command?
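
For illustration, a quant-tcc call that combines bootstrapping with --matrix-to-files might be assembled as in the sketch below. Only `-b 100` and `--matrix-to-files` come from this thread. The remaining arguments (`--long`, `-P`, `-f`, `-i`, `-e`, `-o`, `-t`) and every file path are assumptions modeled on the long-read workflow, so they should be checked against the usage text printed by `kallisto quant-tcc` for the installed version. The helper name `build_quant_tcc_command` is hypothetical, and the sketch only builds and prints the command; it does not run kallisto.

```python
import shlex


def build_quant_tcc_command(index, tcc_matrix, ec_file, flens_file, out_dir,
                            bootstraps=100, threads=4, platform="ONT"):
    """Build (but do not run) a kallisto quant-tcc argument list.

    Hypothetical helper: the argument set besides -b and --matrix-to-files
    is an assumption and should be verified against `kallisto quant-tcc`.
    """
    return [
        "kallisto", "quant-tcc",
        "-t", str(threads),
        "--long", "-P", platform,   # assumed long-read options
        "-b", str(bootstraps),      # bootstrap replicates, as in the issue
        "--matrix-to-files",        # the option suggested in this thread
        "-f", str(flens_file),
        "-i", str(index),
        "-e", str(ec_file),
        "-o", str(out_dir),
        str(tcc_matrix),
    ]


# All paths below are placeholders, not taken from the issue.
cmd = build_quant_tcc_command(
    index="index.idx",
    tcc_matrix="bus_out/count.mtx",
    ec_file="bus_out/count.ec.txt",
    flens_file="bus_out/flens.txt",
    out_dir="quant_out",
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, which makes it straightforward to hand to `subprocess.run` or to a Snakemake rule once the flags have been verified.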

apsteinberg commented 1 month ago

Hi Rebekah,

Thanks for getting back to me! I did end up adding that option, and it worked. Very excited to start playing around with the results. If I may suggest, it would be useful to note in the description of this flag, or in the manual, that it is needed to get the abundance.h5 output. I ended up figuring it out by going through the old release notes on GitHub.

Cheers, Asher

bound-to-love commented 1 month ago

Let me know if you have any other questions! Thanks for the feedback, Asher!