mitoNGS / MToolBox

A bioinformatics pipeline to analyze mtDNA from NGS data
http://sourceforge.net/projects/mtoolbox/?source=navbar
GNU General Public License v3.0

Switched HF in high depth samples #79

Open jalwillcox opened 5 years ago

jalwillcox commented 5 years ago

Thank you for providing such a wonderful tool!

Full disclosure: I generated my data a while back, so I used MToolBox version 1.0. If this issue has already been addressed in the newer version, I'm sorry for the outdated post!

It looks like, for some multi-allelic sites in some of my samples, the order of the heteroplasmic fractions (HF) in the VCF does not match the order of the alternate alleles. For example, when comparing results from one individual's DNA and RNA at position 10664, the lines from the VCF files were:

DNA: chrMT 10664 . T C . PASS AC=1;AN=2 GT:DP:HF:CILOW:CIUP 0/1:473:0.998:0.987:1.0

RNA: chrMT 10664 . T A,C . PASS AC=1,1;AN=3 GT:DP:HF:CILOW:CIUP 0/1/2:7409:0.999,0.001:0.997,0.0:0.999,0.002

IGVTools showed the RNA at that position with an HF of 0.999 for the C allele rather than the A allele. This agrees with the DNA and seems more plausible, since 10664C is much more common than 10664A.
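To make the mismatch concrete, here is a minimal Python sketch that pairs each ALT allele with its HF value in the order the record lists them. The helper name `alt_to_hf` is hypothetical and not part of MToolBox; the sketch assumes a single-sample record with the `GT:DP:HF:CILOW:CIUP` FORMAT shown above, and it splits on any whitespace because the lines in this post are space-separated (real VCF files are tab-separated).

```python
# The two records quoted above, copied verbatim from this post.
DNA = "chrMT 10664 . T C . PASS AC=1;AN=2 GT:DP:HF:CILOW:CIUP 0/1:473:0.998:0.987:1.0"
RNA = ("chrMT 10664 . T A,C . PASS AC=1,1;AN=3 GT:DP:HF:CILOW:CIUP "
       "0/1/2:7409:0.999,0.001:0.997,0.0:0.999,0.002")


def alt_to_hf(vcf_line):
    """Map each ALT allele to its HF value, in the order the record lists them.

    Hypothetical helper for illustration only; assumes one sample column.
    """
    fields = vcf_line.split()
    alts = fields[4].split(",")                      # ALT column
    sample = dict(zip(fields[8].split(":"),          # FORMAT keys
                      fields[9].split(":")))         # sample values
    hfs = [float(x) for x in sample["HF"].split(",")]
    if len(alts) != len(hfs):
        raise ValueError("ALT and HF have different numbers of entries")
    return dict(zip(alts, hfs))


print("DNA:", alt_to_hf(DNA))
print("RNA:", alt_to_hf(RNA))
```

Read this way, the RNA record assigns 0.999 to A and 0.001 to C, while the DNA record assigns 0.998 to C, which is the disagreement described above.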

Unfortunately, I don't have a good sense as to where the issue may be stemming from, but I can say that it was rare and popped up most in samples with high depth (e.g. RNA). That could be because sites are more often reported as multiallelic in high depth samples, but I'm not positive.
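One way to find further candidates for manual review might be to compare paired DNA and RNA records and flag positions where the ALT allele with the highest listed HF differs. The sketch below is only a heuristic under stated assumptions: the function names `top_alt` and `discordant_sites` are hypothetical, each input is assumed to be a single-sample VCF with an `HF` FORMAT field, and a flagged site could also reflect a genuine DNA/RNA difference, so each hit would still need checking in a viewer.

```python
# The two records quoted above, used here as a tiny worked example.
DNA = "chrMT 10664 . T C . PASS AC=1;AN=2 GT:DP:HF:CILOW:CIUP 0/1:473:0.998:0.987:1.0"
RNA = ("chrMT 10664 . T A,C . PASS AC=1,1;AN=3 GT:DP:HF:CILOW:CIUP "
       "0/1/2:7409:0.999,0.001:0.997,0.0:0.999,0.002")


def top_alt(vcf_line):
    """Return (position, ALT allele carrying the highest HF as listed)."""
    f = vcf_line.split()
    alts = f[4].split(",")
    sample = dict(zip(f[8].split(":"), f[9].split(":")))
    hfs = [float(x) for x in sample["HF"].split(",")]
    # zip pairs alleles and fractions positionally, exactly as the record lists them
    _, best_alt = max(zip(hfs, alts))
    return int(f[1]), best_alt


def discordant_sites(dna_lines, rna_lines):
    """List (position, DNA top allele, RNA top allele) where the two disagree."""
    def records(lines):
        # Skip blank lines and VCF header lines
        return (l for l in lines if l.strip() and not l.startswith("#"))

    dna_top = dict(top_alt(l) for l in records(dna_lines))
    flagged = []
    for line in records(rna_lines):
        pos, alt = top_alt(line)
        if pos in dna_top and dna_top[pos] != alt:
            flagged.append((pos, dna_top[pos], alt))
    return flagged


print(discordant_sites([DNA], [RNA]))
```

For real files, the two arguments could be open file handles for the DNA and RNA VCFs, since the functions only iterate over lines.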

I hope this is helpful and please let me know if more information is needed!

Jon

clody23 commented 5 years ago

Hi Jon

Thanks for letting us know about this issue. I am afraid we would need an example to be able to debug it.

If you could share that with us we could look into it. Otherwise, I will keep an eye on high read depth sequencing I have available and see if I ever encounter this issue.

Meanwhile I am flagging this as a bug.

Thanks

Claudia