pachterlab / kallistobustools

kallisto | bustools workflow for pre-processing single-cell RNA-seq data
https://kallistobus.tools/
MIT License

How to specify filtered barcodes to generate a loom file? #59

Open micosacak opened 8 months ago

micosacak commented 8 months ago

First of all, thank you for this nice tool; it is really helpful for single-cell analyses. I have been using cellranger and now want to compare kb against it. kb is really fast (10 min for kb versus 3 h for cellranger on my test sample)!

However, I have run into some problems:

My first problem is a memory error, which happens when I run kb count --loom. By default kb uses all barcodes to create the loom file, and this step runs out of memory.

I ran kb count --filter --loom --cellranger -t 1 -m 500G. Even with 500 GB of RAM and 1 thread, it still raises the error below.

numpy.core._exceptions._ArrayMemoryError: Unable to allocate 461. GiB for an array with shape (32521, 1903511) and data type float64
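As a rough check on the number in this error, the size of a dense float64 array follows directly from its shape at 8 bytes per element. The sketch below only redoes that arithmetic; the helper name dense_gib is made up for illustration and is not part of kb or numpy.

```python
def dense_gib(n_rows, n_cols, bytes_per_element=8):
    """Size in GiB of a dense n_rows x n_cols array (float64 = 8 bytes)."""
    return n_rows * n_cols * bytes_per_element / 2**30


# Shape taken from the error message: 32521 x 1903511 (all barcodes).
print(f"all barcodes:      {dense_gib(32521, 1903511):.1f} GiB")

# Same 32521 dimension with roughly 10000 filtered barcodes instead.
print(f"filtered barcodes: {dense_gib(32521, 10000):.1f} GiB")
```

So the full array is simply too large to hold densely, while the same array restricted to about 10000 barcodes would need only a few GiB.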

I could not find any documentation on using the filtered barcodes to generate a loom file.

My questions:

(1) How can I specify a barcode list to be used to generate the loom file? Instead of all 1903511 barcodes, the filtered list would contain around 10000 barcodes.

(2) When I specify --cellranger, kb does not create a cellranger-compatible output in the counts_filtered folder, although there is one in counts_unfiltered.

P.S. I know I can use filtered_barcodes to subset counts_unfiltered/cellranger, but a solution within kb would be great.

(3) Is there a way to convert BUS files to BAM format?
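For the manual workaround mentioned in the P.S. under question (2), here is a minimal sketch of subsetting a sparse count matrix by a barcode list. It assumes the matrix is available as (row, col, count) triplets with one barcode per row; the function name subset_by_barcodes and the toy data are hypothetical, and reading the actual kb output files is left out.

```python
def subset_by_barcodes(barcodes, entries, keep):
    """Keep only the matrix rows whose barcode is in `keep`.

    barcodes: list of barcode strings, row i corresponds to barcodes[i]
    entries:  iterable of (row, col, count) triplets for non-zero cells
    keep:     iterable of barcodes to retain (e.g. the filtered list)
    Returns (kept_barcodes, new_entries) with rows renumbered from 0.
    """
    keep = set(keep)
    new_row = {}          # old row index -> new compacted row index
    kept_barcodes = []
    for old, bc in enumerate(barcodes):
        if bc in keep:
            new_row[old] = len(kept_barcodes)
            kept_barcodes.append(bc)
    # Drop entries from discarded rows and renumber the remaining ones.
    new_entries = [(new_row[r], c, v) for r, c, v in entries if r in new_row]
    return kept_barcodes, new_entries


# Toy data (made up): 4 barcodes x 3 genes as (row, col, count) triplets.
barcodes = ["AAAC", "AAAG", "AAAT", "AACA"]
entries = [(0, 0, 5), (1, 2, 1), (2, 1, 7), (3, 0, 2), (3, 2, 4)]
filtered = ["AAAC", "AACA"]

kept, sub = subset_by_barcodes(barcodes, entries, filtered)
print(kept)
print(sub)
```

Because only non-zero entries are stored, memory scales with the number of counts rather than with barcodes times genes. With scipy, the same selection could be done by indexing a CSR matrix with the kept row indices.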

Sorry for adding more questions to the same issue!

Thank you very much.