Clinical-Genomics / scout

VCF visualization interface
https://clinical-genomics.github.io/scout
BSD 3-Clause "New" or "Revised" License

MT DNA freq guesstimate on each variant #5047

Open dnil opened 1 week ago

dnil commented 1 week ago

Is your feature request related to a problem in the current program or to newly available technology or software? Please describe and add links/citations if appropriate.
We currently export AD for ref and alt. It would be convenient for the end users to have a frequency guesstimate based on these.

Describe the solution you'd like
Divide AD alt by the sum of AD alt and AD ref. Present the result as a percentage on the mtDNA report.
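A minimal sketch of the proposed calculation, for discussion only. The function name `alt_frequency_percent` is hypothetical and not part of scout; it assumes AD arrives as a (ref, alt) pair for a biallelic record, and that negative depths mean "missing".

```python
from typing import Optional, Sequence


def alt_frequency_percent(ad: Sequence[int]) -> Optional[float]:
    """Estimate the alt allele frequency (in percent) from FORMAT/AD.

    Assumes `ad` is (ref_depth, alt_depth) for a biallelic record.
    Returns None when the estimate is undefined.
    """
    if len(ad) != 2:
        # Multiallelic or malformed AD: out of scope for this sketch.
        return None
    ref_depth, alt_depth = ad
    if ref_depth < 0 or alt_depth < 0:
        # Assumption: negative values encode a missing depth.
        return None
    total = ref_depth + alt_depth
    if total == 0:
        # No supporting reads, so no frequency can be estimated.
        return None
    return round(100.0 * alt_depth / total, 1)


print(alt_frequency_percent((90, 10)))  # 10.0
```

Returning None rather than 0 for zero depth keeps "no data" distinguishable from "no alt reads" on the report.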

Additional context
It may be necessary to deal with multiallelics; discussion is ongoing.

dnil commented 1 week ago

In summary, after a good discussion with @ramprasadn: we may indeed have multiallelics on the MT side, and they need to be accounted for. Currently (MIP) we have a bcftools norm splitting step, but that will produce poor AD values. We should investigate whether we can prioritise AF in parsing for MTs as well as for Balsamic. However, DeepVariant does not currently produce a FORMAT.AF. It is assumed that GT is sufficient and better reflects ground truth in an ordinary, non-mosaic germline case, so we can't just use AF for everything.
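One way the "prioritise AF, fall back to AD" idea could look, as a rough sketch rather than scout's actual parsing. The function name `allele_fractions` and its arguments are hypothetical. For the AD fallback it assumes an unsplit record whose AD lists ref first and then every alt, and it uses the sum over all alleles as the denominator so that multiallelic sites are not overestimated.

```python
from typing import List, Optional, Sequence


def allele_fractions(
    ad: Optional[Sequence[int]] = None,
    af: Optional[Sequence[float]] = None,
) -> Optional[List[float]]:
    """Return one fraction per alt allele, or None if nothing usable exists.

    Prefers a caller-supplied FORMAT/AF; falls back to FORMAT/AD otherwise.
    """
    if af is not None and len(af) > 0:
        # The caller reported AF, so trust it over a depth-based estimate.
        return [float(value) for value in af]
    if ad is None or len(ad) < 2 or any(depth < 0 for depth in ad):
        # Missing or malformed AD (assumption: negatives encode missing).
        return None
    total = sum(ad)
    if total == 0:
        return None
    # Denominator covers ref plus all alts, not just ref plus this alt.
    return [alt_depth / total for alt_depth in ad[1:]]


print(allele_fractions(ad=[80, 15, 5]))     # [0.15, 0.05]
print(allele_fractions(ad=[80, 20], af=[0.3]))  # [0.3]
```

For records already split by bcftools norm, the AD of each split record may no longer include the other alts, which is the "poor AD values" concern above; this sketch does not try to repair that.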

Going forward, raredisease will use --keep-sum for bcftools norm, so the AD counts should improve soon.