single-cell-genetics / vireo

Demultiplexing pooled scRNA-seq data with or without genotype reference
https://vireoSNP.readthedocs.io
Apache License 2.0

Which VCF files to use for SNP calling? #44

Open ShouWenWang opened 2 years ago

ShouWenWang commented 2 years ago

Hi, thanks for developing this tool for genotyping. I performed SNP calling with cellsnp-lite using two different candidate SNP lists, one with 7.4M SNPs (minor allele frequency (MAF) > 0.05) and one with 36.6M SNPs (MAF > 0.0005), and then ran VireoSNP on each. The results are not the same. Which VCF reference file should I use, and what is the rationale? Thanks!

huangyh09 commented 2 years ago

Hi,

Sorry for the delay. Inconsistencies can occasionally arise because the model may converge to different local optima. The results are normally more stable if you increase the number of random initializations with -m, e.g., to 100 or 200.
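Before concluding that two runs disagree, it can help to check how much of the difference is only a relabelling of donors, since without a genotype reference the donor names (donor0, donor1, ...) are arbitrary and can be swapped between runs. Below is a minimal Python sketch of such a check. It assumes each run wrote a tab-separated `donor_ids.tsv` with `cell` and `donor_id` columns, and that `doublet` and `unassigned` are the only non-donor labels; the helper names `read_donor_ids` and `best_label_agreement` are hypothetical and are not part of vireo.

```python
import csv
from collections import Counter
from itertools import permutations


def read_donor_ids(path):
    """Read a donor_ids.tsv-style file into {cell barcode: donor label}.

    Assumes tab-separated columns named 'cell' and 'donor_id'.
    """
    with open(path, newline="") as handle:
        return {row["cell"]: row["donor_id"]
                for row in csv.DictReader(handle, delimiter="\t")}


def best_label_agreement(run_a, run_b, fixed=("doublet", "unassigned")):
    """Fraction of shared cells that agree after optimally renaming donors.

    Labels listed in `fixed` are compared as-is; all other labels are
    treated as arbitrary donor names and matched by brute force.
    Returns (agreement, mapping from run_b donor label to run_a donor label).
    """
    shared = set(run_a) & set(run_b)
    if not shared:
        raise ValueError("the two runs share no cell barcodes")
    # Count how often each (label in A, label in B) pair occurs.
    pairs = Counter((run_a[c], run_b[c]) for c in shared)
    donors_a = sorted({a for a, _ in pairs if a not in fixed})
    donors_b = sorted({b for _, b in pairs if b not in fixed})
    # Pad the shorter side so every permutation pairs up all labels.
    size = max(len(donors_a), len(donors_b))
    donors_a += [None] * (size - len(donors_a))
    donors_b += [None] * (size - len(donors_b))
    fixed_hits = sum(pairs[(label, label)] for label in fixed)
    best_hits, best_map = -1, {}
    # Brute force over relabellings: fine for a handful of donors only.
    for perm in permutations(donors_a):
        hits = sum(pairs[(a, b)] for a, b in zip(perm, donors_b))
        if hits > best_hits:
            best_hits = hits
            best_map = {b: a for a, b in zip(perm, donors_b)
                        if a is not None and b is not None}
    return (best_hits + fixed_hits) / len(shared), best_map


# Toy example: the two runs differ only by swapped donor names.
run_a = {"c1": "donor0", "c2": "donor0", "c3": "donor1", "c4": "doublet"}
run_b = {"c1": "donor1", "c2": "donor1", "c3": "donor0", "c4": "doublet"}
agreement, mapping = best_label_agreement(run_a, run_b)
print(agreement, mapping)
```

The brute-force search grows factorially with the number of donors, so for larger pools an assignment solver would be a better choice. A low agreement even after relabelling suggests a real difference between runs rather than a label swap.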

Usually, both reference lists are sufficiently large, so the 7.4M list is the one commonly recommended. Also, when running cellsnp-lite, we normally keep only SNPs with MAF > 0.1, though the right threshold also depends on the number of donors you have pooled. On the other hand, if the coverage per cell is extremely low, you may consider genotyping donors in de novo mode (cellsnp-lite mode 2), in addition to trying the 36.6M list.

Yuanhua