Open MattWellie opened 7 months ago
This is sourced from the public Broad references
$ gsutil cat gs://gcp-public-data--broad-references/hg38/v0/exome_calling_regions.v1.interval_list
...
chrM 427 16173 ...
It's only a few kb, so we could easily subtract it from our exome interval list, but I'm curious: what is the downside of calling variants in this region? Is it just that it's better to leave them to our mito-specific pipeline?
The downside here is that mito calling is fundamentally different. Instead of 2 copies of each chromosome, cells contain many copies of the mitochondrial genome, so instead of WT/Het/Hom, a mito call is a continuous value: the % of mito genomes carrying the variant. In that sense it's more like somatic/cancer analysis, where variants can be picked up at really low levels whilst still being true, so it takes a different approach to get clean results.
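To make the contrast concrete, here is a minimal sketch in Python. The function names and the 25%/75% cut-offs are made up purely for illustration; HaplotypeCaller does not bucket on fixed thresholds (it works from genotype likelihoods), so treat this as a cartoon of the WT/Het/Hom idea rather than a description of any real caller.

```python
def heteroplasmy(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads (a proxy for mito genome copies) carrying the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def diploid_style_bucket(fraction: float, het_low: float = 0.25, het_high: float = 0.75) -> str:
    """Force a continuous fraction into WT/Het/Hom.

    The thresholds are hypothetical and only illustrate what is lost
    when a continuous value is squeezed into three genotype classes.
    """
    if fraction < het_low:
        return "WT"
    if fraction > het_high:
        return "Hom"
    return "Het"


# A variant present in 30 of 1000 reads: a plausible low-level heteroplasmy.
low_level = heteroplasmy(30, 1000)
low_level_bucket = diploid_style_bucket(low_level)
```

With these illustrative numbers, a 3% heteroplasmic variant falls into the "WT" bucket and is effectively invisible under a diploid-style model, even though it could be a true call.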
HaplotypeCaller and JointGenotyping aren't optimised for that - it's not to say the results are bad or wrong, just that we've probably not looked into the quality of calls in this region.
Our mito calling pipeline is currently untested on exomes, but it should work, perhaps with some minor tweaks. Exome captures vary in how many mito reads they return, but modern ones have decent coverage, so we should move to support this. Once that is done we should remove chrM from our target intervals.
Recently Paul (VCGS) flagged that Mito variants are coming through into Exome AIP reports.
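For when we get to that point, removing the contig could look roughly like the sketch below. It assumes the usual Picard-style interval list layout (header lines starting with `@`, then tab-separated contig, start, end, ...). The `drop_contig` helper is hypothetical, and the strand and name columns in the example are placeholders, not the real contents of the Broad file.

```python
def drop_contig(lines, contig="chrM"):
    """Return (kept_lines, removed_bases) for a Picard-style interval list."""
    kept = []
    removed_bases = 0
    for line in lines:
        # Header lines (@HD, @SQ, ...) are passed through untouched,
        # so the sequence dictionary still lists every contig.
        if line.startswith("@"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[0] == contig:
            # Interval lists are 1-based and inclusive at both ends.
            removed_bases += int(fields[2]) - int(fields[1]) + 1
            continue
        kept.append(line)
    return kept, removed_bases


# Placeholder data: only the chrM coordinates come from this thread.
example = [
    "@HD\tVN:1.5\n",
    "@SQ\tSN:chrM\tLN:16569\n",
    "chr1\t100\t200\t+\tplaceholder_1\n",
    "chrM\t427\t16173\t+\tplaceholder_2\n",
]
kept, removed = drop_contig(example)
```

For the single chrM interval quoted above (427 to 16173), this removes 15747 bases, which matches the "only a few kb" estimate.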
The final line of
gs://cpg-common-main/references/hg38/v0/exome_calling_regions.v1.interval_list
is chrM 427 16173 ...
The whole-genome equivalent
gs://cpg-common-main/references/hg38/v0/wgs_calling_regions.v1.interval_list
terminates at chrY.
We have a separate mito calling workflow, so is this accidental calling?