berman-lab / ymap

YMAP - Yeast Mapping Analysis Pipeline : An online pipeline for the analysis of yeast genomic datasets.
MIT License

Aberrant SNP ratios in homozygous regions. #41

Open darrenabbey opened 8 years ago

darrenabbey commented 8 years ago

The Cyberlindnera fabianii figure illustrated in issue #38 revealed an issue in analysis and/or visualization.

[Image: Cyberlindnera fabianii figure from issue #38]

Chromosomes 7, 11, 16, and 17 are, to a reasonable approximation, entirely homozygous. They have so little SNP data that the minimal heterozygous signal that is present is overwhelmed by a peculiar error in the sequencing data. That error makes low-information regions appear to have a 1:2 ratio of SNP alleles, even though a 1:2 ratio is inconsistent with a diploid genome like this one.

In cases like this, the SNP-ratio histogram should probably be scaled down to avoid misrepresenting the error signal as an informative heterozygosity signal. The difficulty is in finding an appropriate heuristic to separate chromosomes that are entirely homozygous, like these, from others that are only mostly homozygous.
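One possible shape for such a heuristic, as a rough sketch only: compare each chromosome's SNP density to the genome-wide median density and shrink the histogram for chromosomes that fall far below it. This is not YMAP's code. The function name, the dictionary inputs, and the `min_relative_density` cutoff are all hypothetical, and the cutoff value would need to be tuned against real datasets.

```python
from statistics import median


def histogram_scale_factors(snp_counts, chr_lengths, min_relative_density=0.1):
    """Return a per-chromosome scale factor in [0, 1] for the SNP-ratio histogram.

    snp_counts and chr_lengths are dicts keyed by chromosome name.
    min_relative_density is a hypothetical cutoff, not a value taken from YMAP.
    """
    # SNPs per base pair for each chromosome.
    densities = {c: snp_counts[c] / chr_lengths[c] for c in snp_counts}
    typical = median(densities.values())

    factors = {}
    for chrom, density in densities.items():
        if typical == 0:
            # No SNP data anywhere: nothing informative to draw.
            factors[chrom] = 0.0
            continue
        relative = density / typical
        if relative >= min_relative_density:
            # Enough SNP data: draw the histogram at full height.
            factors[chrom] = 1.0
        else:
            # Sparse SNP data: shrink linearly toward zero.
            factors[chrom] = relative / min_relative_density
    return factors
```

With this sketch, a chromosome carrying 20 SNPs next to two same-sized chromosomes carrying 4000 and 5000 would be drawn at about 5% height, while the other two stay at full height. It does not solve the harder part of the problem: if most chromosomes are homozygous the median itself becomes unreliable, and a single linear cutoff does not cleanly separate "entirely" from "mostly" homozygous chromosomes.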

darrenabbey commented 8 years ago

This issue may or may not be related to #12.

vladimirg commented 7 years ago

This is important to #52, as automatic ploidy detection can be thrown off by this type of error.