Kurt-Hetrick / CIDR_SEQ_CAPTURE_JOINT_CALL

CIDR exome and targeted resequencing joint calling/filtering pipeline

investigate difference in snp count per sample between hg19 and grch37 #78

Closed Kurt-Hetrick closed 4 years ago

Kurt-Hetrick commented 4 years ago

roughly 400 fewer SNPs per person in hg19 for peters. the references are different: grch37 has decoy sequences and hg19 has alternate haplotypes, but should figure out if there is a general feature that explains the difference.
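For reference, a minimal sketch of how the per-sample SNP counts could be tallied from a multi-sample VCF, so the same samples can be compared across the two builds. This is not the pipeline's own code. The function name `snp_counts_per_sample` is hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF with a `GT` field in FORMAT and counts any non-reference genotype at a single-base REF/ALT site as one SNP.

```python
from collections import Counter


def snp_counts_per_sample(vcf_lines):
    """Count non-reference SNP genotypes per sample (hypothetical helper).

    Assumes every record has a FORMAT column containing GT.
    """
    counts = Counter()
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        ref, alts = fields[3], fields[4].split(",")
        # Keep only sites where REF and every ALT are a single base.
        if len(ref) != 1 or any(len(a) != 1 or a in (".", "*") for a in alts):
            continue
        gt_index = fields[8].split(":").index("GT")
        for sample, call in zip(samples, fields[9:]):
            alleles = call.split(":")[gt_index].replace("|", "/").split("/")
            # Count the site if the sample carries any non-reference allele.
            if any(a not in ("0", ".") for a in alleles):
                counts[sample] += 1
    return counts
```

Running this on the hg19 and grch37 call sets and subtracting the two counters per sample would give the per-person difference described above.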

Kurt-Hetrick commented 4 years ago

this is about as expected. grch37 is getting a lot of SNP calls from the MHC region (HLA-A/HLA-C, MAS1L, RING1, TAP2) at higher frequencies, which account for at least 600 loci. some offsets that are hg19-only are the typical MUC4 and CDC27, which are due to the lack of decoy sequence and account for about 100 loci. other hg19-only loci are in BCLAF1, ARSD, CSMD1, RYR1 and SYNE1, which account for another 120 or so. not familiar with these, but assuming that these are also due to a lack of decoy sequence.
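A rough sketch of the kind of comparison behind these numbers: collect the SNP loci from each build, find the loci present in only one of them, and tally those by gene. This is an illustration rather than the code actually used. The names `normalize_chrom` and `build_only_by_gene` are hypothetical, gene annotation is assumed to be already attached to each locus, and coordinates on the primary chromosomes are assumed to be comparable once the UCSC-style `chr` prefix is stripped (the mitochondrial contig is not handled).

```python
from collections import Counter


def normalize_chrom(chrom):
    """Strip the 'chr' prefix so hg19 and grch37 contig names line up."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def build_only_by_gene(hg19_loci, grch37_loci):
    """Tally loci seen in only one build, grouped by gene (hypothetical helper).

    Each input is an iterable of (chrom, pos, gene) tuples.
    Returns (hg19_only, grch37_only) as Counters keyed by gene.
    """
    def index(loci):
        # Key each locus by normalized (chrom, pos); the value is its gene.
        return {(normalize_chrom(c), p): g for c, p, g in loci}

    hg19, grch37 = index(hg19_loci), index(grch37_loci)
    hg19_only = Counter(hg19[k] for k in hg19.keys() - grch37.keys())
    grch37_only = Counter(grch37[k] for k in grch37.keys() - hg19.keys())
    return hg19_only, grch37_only
```

Calling `most_common()` on either counter lists the genes contributing the most build-specific loci, which is how clusters such as the MHC genes or MUC4/CDC27 would stand out.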