Open alexfernandes8a opened 1 year ago
Hi, Mode 2a is more suitable for small datasets. For large datasets, you may try Mode 2b + Mode 1a. Mode 2a does joint calling and genotyping, but it is substantially slower than first calling SNPs in bulk with Mode 2b and then genotyping the cells with Mode 1a. To speed things up further, you may try the --minMAF 0.1 --minCOUNT 100 options in both modes.
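
For reference, the two-step route (Mode 2b, then Mode 1a) could look roughly like the sketch below. The file names, output directories, and thread count are placeholders I made up, and the assumption that the Mode 2b output VCF is named cellSNP.base.vcf.gz (with --gzip) should be checked against your own output directory. Please verify the flags against `cellsnp-lite --help` for your installed version. The script only prints the commands by default; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
# Sketch: bulk SNP calling (Mode 2b) followed by per-cell genotyping (Mode 1a).
# All paths and the thread count are placeholders, not values from this thread.
set -euo pipefail

BAM="${BAM:-possorted_genome_bam.bam}"   # hypothetical 10x BAM file
BARCODES="${BARCODES:-barcodes.tsv}"     # hypothetical cell barcode list
OUT_2B="${OUT_2B:-cellsnp_mode2b}"       # output dir for the bulk calling step
OUT_1A="${OUT_1A:-cellsnp_mode1a}"       # output dir for the genotyping step
THREADS="${THREADS:-16}"
DRY_RUN="${DRY_RUN:-1}"                  # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1 (Mode 2b): treat the whole BAM as one bulk sample, no barcode file.
call_bulk() {
  run cellsnp-lite -s "$BAM" -O "$OUT_2B" -p "$THREADS" \
    --minMAF 0.1 --minCOUNT 100 --cellTAG None --gzip
}

# Step 2 (Mode 1a): genotype each cell at the SNPs found in step 1.
# Assumes step 1 wrote cellSNP.base.vcf.gz into $OUT_2B.
genotype_cells() {
  run cellsnp-lite -s "$BAM" -b "$BARCODES" -O "$OUT_1A" \
    -R "$OUT_2B/cellSNP.base.vcf.gz" -p "$THREADS" \
    --minMAF 0.1 --minCOUNT 100 --gzip
}

call_bulk
genotype_cells
```

Because step 2 only piles up reads at the candidate positions from step 1, the per-cell work is limited to those sites rather than every covered position.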
Hi! I have a huge 10x scRNA-seq mouse dataset (~60 GB BAM file, ~50K cells from 12 mice) that I am trying to run through cellSNP-lite. I compiled cellSNP-lite in an HPC environment and am running it there in Mode 2a. The problem is that, no matter how much RAM I use, I constantly get the message "Combined max depth is above 1M. Potential memory hog!", and the job has already been running for 11 days. I know it is a lot of data, so I am wondering what the best approach would be in this scenario. Perhaps splitting the cell barcodes file? Any help is highly appreciated! Thank you so very much.