single-cell-genetics / cellsnp-lite

Efficient genotyping bi-allelic SNPs on single cells
https://cellsnp-lite.readthedocs.io
Apache License 2.0

AF for candidate SNP #66

Closed: Zepeng-Mu closed this issue 1 year ago

Zepeng-Mu commented 1 year ago

Hello, this line of code suggests that SNPs with AF<0.0005 are removed from the list of provided candidate SNPs: https://github.com/single-cell-genetics/cellsnp-lite/blob/0885d746b0b1ea65c8ef92f8943ca7669ca9734a/scripts/SNPlist_1Kgenome.sh#L48 But this AF field ranges between (0, 1). Does this mean SNPs with AF>0.9995 are kept? These are probably also rare SNPs, just with the major allele coded as ALT in the 1kg file (so the rare allele is REF).
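To make the concern concrete, here is a minimal Python sketch contrasting a one-sided cutoff on AF with a cutoff on the folded minor allele frequency, min(AF, 1 - AF). The function names are hypothetical, and the exact boundary handling (`>=`) is an assumption; this is not the logic of SNPlist_1Kgenome.sh itself.

```python
def minor_allele_frequency(af: float) -> float:
    """Fold an ALT allele frequency into a minor allele frequency."""
    if not 0.0 <= af <= 1.0:
        raise ValueError(f"AF must be within [0, 1], got {af}")
    return min(af, 1.0 - af)


def kept_by_af_cutoff(af: float, cutoff: float = 0.0005) -> bool:
    """One-sided filter: drops only SNPs whose ALT allele is rare."""
    return af >= cutoff  # assumption: boundary value is kept


def kept_by_maf_cutoff(af: float, cutoff: float = 0.0005) -> bool:
    """Two-sided filter: drops SNPs where either allele is rare."""
    return minor_allele_frequency(af) >= cutoff


if __name__ == "__main__":
    # Illustrative AF values only, not taken from the 1000 Genomes file.
    for af in (0.0001, 0.3, 0.9999):
        print(af, kept_by_af_cutoff(af), kept_by_maf_cutoff(af))
```

With these illustrative values, a SNP with AF = 0.9999 passes the one-sided AF cutoff but fails the folded MAF cutoff, which is the case described above.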

Thanks

hxj5 commented 1 year ago

Hi, you are right. Some SNPs do have AF>0.9995: about 0.08% (29439/36573628) of the SNPs in the file genome1K.phase3.SNP_AF5e4.chr1toX.hg19.vcf.gz.

If such a SNP also shows a high AF (e.g., AF>0.9995) in a given donor sample, it can be filtered out by setting cellsnp-lite's --minMAF appropriately (e.g., --minMAF 0.1).
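As a rough illustration of why a minor-allele threshold on the observed data removes such sites, the Python sketch below computes a minor allele fraction from aggregated REF/ALT read counts and compares it with a threshold. The function names and counts are hypothetical, and the formula is an assumption about how such a filter behaves in general, not cellsnp-lite's actual implementation of --minMAF.

```python
def observed_maf(ref_count: int, alt_count: int) -> float:
    """Minor allele fraction from aggregated REF and ALT read counts."""
    if ref_count < 0 or alt_count < 0:
        raise ValueError("read counts must be non-negative")
    total = ref_count + alt_count
    if total == 0:
        return 0.0  # no coverage: treat as uninformative
    return min(ref_count, alt_count) / total


def passes_min_maf(ref_count: int, alt_count: int, min_maf: float = 0.1) -> bool:
    """True if the site would survive a minor-allele-fraction threshold."""
    return observed_maf(ref_count, alt_count) >= min_maf


if __name__ == "__main__":
    # Hypothetical counts: a site where the sample carries only ALT,
    # and a site where both alleles are observed.
    print(passes_min_maf(0, 200))
    print(passes_min_maf(90, 110))
```

A site where the sample carries almost exclusively the ALT allele has a minor allele fraction near zero, so it falls below a threshold such as 0.1 regardless of its AF in the population reference.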

Zepeng-Mu commented 1 year ago

I see. That makes sense. Thanks!