single-cell-genetics / cellsnp-lite

Efficient genotyping of bi-allelic SNPs in single cells
https://cellsnp-lite.readthedocs.io
Apache License 2.0

verbose mode #37

Closed (vjbaskar closed 2 years ago)

vjbaskar commented 2 years ago

Hi, I am trying to use cellsnp-lite with 10x data in Mode 2a. It is taking quite a while, and the CPUs are being used as expected. I would like more information about what cellsnp-lite is doing while it runs. Is there a verbose mode that would report this?

Thanks!

hxj5 commented 2 years ago

Hi, Mode 2a is indeed slow for large datasets. To speed it up, you can apply stricter filtering (--minCOUNT and --minMAF) or use the alternative "Mode 2b + Mode 1a" workflow (calling SNPs in bulk mode, followed by genotyping in single-cell mode).

For now, cellsnp-lite only reports the percentage of SNPs or chromosomes that have been processed. A verbose mode could be implemented in a future release. Thanks for your feedback.
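
For readers who want to try the "Mode 2b + Mode 1a" route mentioned above, here is a minimal sketch in Python that only builds the two command lines. The flags are the ones I understand the cellsnp-lite documentation to use for these modes; please confirm them against `cellsnp-lite --help` for your version. The file names, the default thresholds, the `UB` UMI tag (assumed for 10x data) and the helper names `mode_2b_cmd` / `mode_1a_cmd` are all illustrative assumptions, not part of the tool.

```python
import subprocess  # only needed if you actually launch the commands


def mode_2b_cmd(bam, out_dir, nproc=8, min_maf=0.1, min_count=100, umi_tag="UB"):
    """Step 1 (Mode 2b): call candidate SNPs on the whole BAM as pseudo-bulk.

    No barcode list is given and the cell tag is disabled, so reads are not
    split by cell. umi_tag="UB" is an assumption for 10x data.
    """
    return [
        "cellsnp-lite", "-s", str(bam), "-O", str(out_dir),
        "-p", str(nproc),
        "--minMAF", str(min_maf), "--minCOUNT", str(min_count),
        "--cellTAG", "None", "--UMItag", umi_tag,
        "--gzip",
    ]


def mode_1a_cmd(bam, barcodes, snp_vcf, out_dir, nproc=8, min_maf=0.1, min_count=20):
    """Step 2 (Mode 1a): genotype single cells at the SNPs listed in snp_vcf."""
    return [
        "cellsnp-lite", "-s", str(bam), "-b", str(barcodes), "-O", str(out_dir),
        "-R", str(snp_vcf),
        "-p", str(nproc),
        "--minMAF", str(min_maf), "--minCOUNT", str(min_count),
        "--gzip",
    ]


def run_two_step(bam, barcodes, bulk_dir, cell_dir):
    """Run both steps in order; raises CalledProcessError if either fails."""
    subprocess.run(mode_2b_cmd(bam, bulk_dir), check=True)
    # Assumed name of the Mode 2b output VCF when --gzip is set.
    snp_vcf = f"{bulk_dir}/cellSNP.base.vcf.gz"
    subprocess.run(mode_1a_cmd(bam, barcodes, snp_vcf, cell_dir), check=True)
```

Calling `run_two_step("possorted.bam", "barcodes.tsv", "out_2b", "out_1a")` would run the bulk step first and then genotype cells only at the SNPs it found, which is where the time saving comes from.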

vjbaskar commented 2 years ago

Hi Xianjie Huang,

Many thanks for the reply. It would indeed be good if we could run it faster, e.g. by running each chromosome separately (there is an option for this) and aggregating the results afterwards. That way we could rerun specific failed chromosomes with more memory or cores.

Thanks,
Vijay
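
A rough sketch of the per-chromosome idea, in Python. It assumes the chromosome option is `--chrom` (as I recall it from the cellsnp-lite help text; please verify) and that chromosome names match those in the BAM header. The output layout `out_root/chr_<name>` and the helper names are hypothetical, and merging the per-chromosome outputs is deliberately left out.

```python
import subprocess
from pathlib import Path


def chrom_cmd(bam, barcodes, out_root, chrom, nproc=1, min_maf=0.1, min_count=100):
    """Build a Mode 2a command restricted to one chromosome."""
    out_dir = Path(out_root) / f"chr_{chrom}"  # hypothetical layout
    return [
        "cellsnp-lite", "-s", str(bam), "-b", str(barcodes), "-O", str(out_dir),
        "--chrom", str(chrom),
        "-p", str(nproc),
        "--minMAF", str(min_maf), "--minCOUNT", str(min_count),
        "--gzip",
    ]


def run_chroms(bam, barcodes, out_root, chroms, runner=subprocess.run, **kwargs):
    """Run each chromosome as its own job and return the ones that failed.

    `runner` is injectable so the loop can be tested, or swapped for a
    cluster submission function, without changing the logic.
    """
    failed = []
    for chrom in chroms:
        result = runner(chrom_cmd(bam, barcodes, out_root, chrom, **kwargs))
        if result.returncode != 0:
            failed.append(chrom)
    return failed
```

The list returned by `run_chroms` can be passed straight back in as `chroms` to retry only the failed chromosomes, for instance from a job that has been given more memory.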

hxj5 commented 2 years ago

Hi Vijay,

In Mode 2a, if multiple cores are specified, cellsnp-lite creates a thread pool, and each chromosome is dispatched to one thread in the pool. When running on a single chromosome, however, cellsnp-lite gains little speed from multiple cores, because the htslib mpileup it depends on is designed to run in a single thread.

Xianjie
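
To illustrate the scheduling described in this reply, here is a toy Python model of a pool with one task per chromosome. It is only an analogy: cellsnp-lite is written in C, and `pileup_chrom` is a hypothetical stand-in for the real pileup work, not its actual code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def pileup_chrom(chrom):
    """Stand-in for the single-threaded pileup of one chromosome."""
    return chrom, threading.current_thread().name


def threads_used(chroms, nproc):
    """Dispatch one task per chromosome and report which threads did work."""
    with ThreadPoolExecutor(max_workers=nproc) as pool:
        results = list(pool.map(pileup_chrom, chroms))
    return {thread_name for _, thread_name in results}


# One chromosome means one task, so only one worker is ever busy,
# no matter how many cores are requested.
single = threads_used(["1"], nproc=8)

# With many chromosomes, at most `nproc` workers share the tasks.
many = threads_used([str(i) for i in range(1, 23)], nproc=8)
```

In this model, parallelism is bounded by the number of chromosomes submitted, which is why asking for more cores on a single-chromosome run brings little benefit.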