berman-lab / ymap

YMAP - Yeast Mapping Analysis Pipeline : An online pipeline for the analysis of yeast genomic datasets.
MIT License

Standard presentation in datasets with a parental strain can be confusing #51

Open vladimirg opened 8 years ago

vladimirg commented 8 years ago

Currently, the standard colors for a dataset analyzed against a parental strain (without a hapmap) use red to denote changes from a 0.5 allelic ratio (e-mail correspondence with Anna Selmecki, September 2016). This is the intended behavior (Abbey et al., Genome Medicine 2014, page 6, 2nd paragraph from the end [1]; e-mail correspondence with Darren Abbey, April 2015), but it is not intuitive: the same paper suggests the red/green coloring (known as "alternate colors" in the current Ymap version) as the standard (page 6, last paragraph in the left column [2]).

The rationale for having red in the standard colors is to handle the case where a hapmap is specified but incomplete, so that some regions of heterozygosity are not covered by it. In such a case, the alleles in those regions cannot be phased and are colored red (as opposed to cyan, magenta, or any other color used for phased alleles) (e-mail correspondence with Darren Abbey, April 2015).

However, in the scenario described here no hapmap is specified at all, so it's worth considering making the alternate colors the default, in this scenario only.
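To make the proposal concrete, here is a minimal Python sketch of the selection rule I have in mind. The function name, its arguments, and the scheme labels are all hypothetical; this is not Ymap's actual code, only an illustration of "use alternate colors when a parent is given without a hapmap".

```python
from typing import Optional


def default_color_scheme(parent: Optional[str], hapmap: Optional[str]) -> str:
    """Pick the default color scheme for a project (hypothetical helper).

    parent: name of the parental strain, or None if none was selected.
    hapmap: name of the hapmap, or None if none was specified.
    """
    # Proposed change: a parent without a hapmap defaults to the
    # red/green "alternate" colors.
    if parent is not None and hapmap is None:
        return "alternate"
    # Every other case keeps the current behavior.
    return "standard"
```

Projects that do specify a hapmap would be unaffected, so red would still mark unphased alleles there.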

[1]

When a parental type strain of unknown genotype (for example, a clinical isolate) is selected for a project, the pipeline first calculates the distribution of SNPs across the parental genome in the manner described above. For comparison of the parental genotype to another related strain (for example, another sample from the same patient), every heterozygous SNP locus in the parent is examined in the second dataset. If the allelic ratio changes from the 0.5 value observed in the reference strain, the SNP is assigned a red color and the final color of each 5,000 bp display bin is calculated as the weighted average of all the SNPs within the bin (Figure 5B).

[2]

In the second style of analysis, a parent strain is chosen and the SNPs in common between that parent and the test strain being analyzed are displayed as grey bars (as in the first style), while any SNPs in the parent that have different allelic ratios in the test strain are displayed in red, if allelic ratios approach 0 or 1, or in green, if ratios suggest unusual allele numbers (often due to CNVs or aneuploidy).
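For reference, the red/green rule from [2] combined with the per-bin averaging from [1] could be sketched roughly as follows. This is an illustration only, not the pipeline's implementation: the RGB values, the `tol` and `hom_cutoff` thresholds, and the per-SNP weight (which might be read depth, for example) are assumptions, since neither quoted passage specifies them.

```python
from typing import Iterable, Optional, Tuple

RGB = Tuple[float, float, float]

# Hypothetical RGB values; the actual Ymap palette may differ.
GREY: RGB = (0.5, 0.5, 0.5)
RED: RGB = (1.0, 0.0, 0.0)
GREEN: RGB = (0.0, 1.0, 0.0)


def snp_color(test_ratio: float, tol: float = 0.1, hom_cutoff: float = 0.1) -> RGB:
    """Color one parent-heterozygous SNP by its allelic ratio in the test strain.

    tol and hom_cutoff are assumed thresholds, not values taken from the paper.
    """
    if abs(test_ratio - 0.5) <= tol:
        return GREY   # ratio unchanged from the parent
    if test_ratio <= hom_cutoff or test_ratio >= 1.0 - hom_cutoff:
        return RED    # ratio approaches 0 or 1
    return GREEN      # unusual ratio (e.g. CNV or aneuploidy)


def bin_color(snps: Iterable[Tuple[float, float]]) -> Optional[RGB]:
    """Weighted average color of the SNPs in one 5,000 bp display bin.

    snps holds (test_ratio, weight) pairs; the weight is an assumed
    per-SNP quantity. Returns None for a bin with no weighted SNPs.
    """
    acc = [0.0, 0.0, 0.0]
    total = 0.0
    for ratio, weight in snps:
        color = snp_color(ratio)
        for i in range(3):
            acc[i] += weight * color[i]
        total += weight
    if total == 0:
        return None
    return (acc[0] / total, acc[1] / total, acc[2] / total)
```

With equal weights, a bin holding one unchanged SNP and one SNP at ratio 0 comes out as a blend of grey and red, which is the kind of intermediate shade the binned display would show.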