Open katajar opened 3 years ago
The problem with mitochondria is relative abundance.
For autosomes, normal samples always contain two copies, and for allosomes, 0/1/2 depending on sex+allosome combination.
In contrast, the number of mitochondria, and therefore of mtDNA copies, varies semi-randomly from sample to sample. It also depends on the tissue type, the assay used and other factors. So total chrM coverage has a huge baseline variance.
Because of this, CNVkit always excludes mitochondria from the analysis entirely, to avoid them messing with the normalisation algorithms: https://github.com/etal/cnvkit/blob/9dd1e7c83705d1e1de6e6e4ab9fdc6973bf4002f/cnvlib/antitarget.py#L115-L122
I suppose it should be technically possible to adjust for this variance by normalising and centering chrM coverage separately from all other chromosomes. However, it would take substantial work to add this functionality. If you or someone else would like to submit a pull request, I'll be happy to review it, but unfortunately I don't have the bandwidth to do this myself.
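To illustrate what "centering chrM separately" could mean, here is a minimal sketch. It is not CNVkit code: the function name `center_separately`, the list of mitochondrial contig names, and the input format (plain `(chromosome, log2)` pairs) are all assumptions made for the example.

```python
from statistics import median

# Hypothetical helper, not part of CNVkit: median-center the log2 values of
# mitochondrial bins and of all other bins independently of each other.
def center_separately(bins, mito_names=("chrM", "MT", "chrMT")):
    mito = [log2 for chrom, log2 in bins if chrom in mito_names]
    nuclear = [log2 for chrom, log2 in bins if chrom not in mito_names]
    # Each group gets its own shift, so high chrM depth cannot move the nuclear baseline.
    mito_shift = median(mito) if mito else 0.0
    nuclear_shift = median(nuclear) if nuclear else 0.0
    return [
        (chrom, log2 - (mito_shift if chrom in mito_names else nuclear_shift))
        for chrom, log2 in bins
    ]

# Made-up values: chrM sits far above the nuclear bins before centering.
bins = [
    ("chr1", 0.1), ("chr1", -0.1), ("chr2", 0.0),
    ("chrM", 5.0), ("chrM", 5.4), ("chrM", 4.6),
]
centered = center_separately(bins)
```

After centering, both groups have a median of zero, so deviations within chrM (for example a large deletion) would show up relative to the chrM baseline rather than the nuclear one. A real implementation would also need to handle bin weights and the reference, which this sketch ignores.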
Out of curiosity, is there a research use case to study chrM copy number variance? Perhaps in cancers?
Thank you so much for the fast reply. Now I understand the issue better. I just wanted to see differences between mitochondrial sequences from various yeast strains, such as big deletions, amplifications and copy numbers. For me it was only a casual question, but I did find a few publications on mtDNA copy number variation in cancers.
@katajar I see, thank you for the context.
I've been thinking about this, and actually there is a way for you to analyse mitochondria while avoiding huge modifications to CNVkit. The approach could work like this:
This should be reasonably straightforward to implement. And in fact, if you decide to proceed with this analysis, I would very much appreciate your feedback. This can perhaps be used in the future to improve the way CNVkit handles mitochondrial and other irregular sequences in human samples.
Is it possible to use CNVkit to analyze the mitochondrial genome? Is it necessary to drop chrM from reference.cnn?