Open mauripops opened 1 week ago
The VCF does include intergenic variants. Most of them are expected to be filtered out by cellsnp-lite during pileup, because few scRNA-seq reads cover them. However, some expressed intergenic variants (for example due to technical reasons or imperfect gene annotations) could provide additional genotype information that helps vireo distinguish donors.
I am currently processing a VCF file of mouse SNPs from the Wellcome Sanger Mouse Genome Project, as listed here: https://www.jax.org/research-and-faculty/genetic-diversity-initiative/tools-data/diversity-outbred-reference-data
I am following the processing described for human data here: https://github.com/single-cell-genetics/cellsnp-lite/blob/master/scripts/SNPlist_1Kgenome.sh which produced the data here: https://sourceforge.net/projects/cellsnp/files/SNPlist/
I was wondering, do the SNPs provided above for humans contain intergenic variants?
My end goal is to run cellsnp-lite for demultiplexing scRNA-seq reads using vireo. Should I remove the intergenic variants, or is there a reason to keep them?
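For concreteness, here is a minimal sketch of what removing intergenic variants could look like. It is not taken from the cellsnp-lite scripts. It assumes gene coordinates are already available as 1-based inclusive (chrom, start, end) tuples, for example parsed from a GTF, and that chromosome names match between the annotation and the VCF. The helper names `build_index`, `is_genic`, and `drop_intergenic` are hypothetical.

```python
from bisect import bisect_right


def build_index(genes):
    """Merge overlapping gene intervals per chromosome.

    genes: iterable of (chrom, start, end), 1-based inclusive (assumed).
    Returns {chrom: (sorted_starts, merged_intervals)}.
    """
    by_chrom = {}
    for chrom, start, end in genes:
        by_chrom.setdefault(chrom, []).append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlaps the previous interval: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], merged)
    return index


def is_genic(index, chrom, pos):
    """True if pos falls inside any merged gene interval on chrom."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and merged[i][0] <= pos <= merged[i][1]


def drop_intergenic(vcf_lines, index):
    """Yield header lines unchanged and only records that overlap a gene."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if is_genic(index, chrom, int(pos)):
            yield line
```

A filter like this only looks at annotated gene bodies, so any variant in an unannotated but expressed region would be dropped.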