Open arahuja opened 8 years ago
What happens when a somatic variant is mapped to an ALT in the tumor sample, but the major contig in the normal sample?
A false positive somatic variant.
Maybe for now we should simply avoid calling variants in polymorphic regions?
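A minimal sketch of what "avoid calling variants in polymorphic regions" could look like as a post-filter. Everything here is an assumption for illustration: the region list (primary-assembly intervals that have ALT scaffolds) would have to come from somewhere, the coordinates are taken to be 0-based half-open, and `build_index`, `in_region`, and `drop_polymorphic` are hypothetical helpers, not part of biokepi or Mutect.

```python
from bisect import bisect_right


def build_index(regions):
    """Group (contig, start, end) regions by contig and merge overlaps.

    Coordinates are assumed to be 0-based, half-open.
    """
    by_contig = {}
    for contig, start, end in sorted(regions):
        merged = by_contig.setdefault(contig, [])
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return by_contig


def in_region(index, contig, pos):
    """True if pos falls inside any merged interval on contig."""
    intervals = index.get(contig, [])
    # Last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def drop_polymorphic(variants, index):
    """Keep only (contig, pos, ...) variants outside the indexed regions."""
    return [v for v in variants if not in_region(index, v[0], v[1])]
```

This trades sensitivity for specificity: real somatic variants in those regions are lost along with the false positives.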
On Mon, Mar 7, 2016 at 2:56 PM, Arun Ahuja notifications@github.com wrote:
Variant calling and the Mutect parallelization currently use only the major contigs. With B38, this would drop the ALT contigs.
Not dropping them makes sense, but in general variant calling likely needs to be rethought for ALT contigs. What happens when a somatic variant is mapped to an ALT in the tumor sample, but the major contig in the normal sample?
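To make the major-contig versus ALT split concrete, here is a rough sketch of partitioning contig names before parallelizing by contig. It assumes UCSC-style GRCh38 naming (`chr1`..`chr22`, `chrX`, `chrY`, `chrM`, with ALT contigs ending in `_alt`); `classify_contig` and `partition_contigs` are hypothetical names and do not reflect how biokepi currently selects contigs.

```python
import re

# Assumption: UCSC-style primary chromosome names.
PRIMARY_RE = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$")


def classify_contig(name):
    """Return 'primary', 'alt', or 'other' for a contig name."""
    if PRIMARY_RE.match(name):
        return "primary"
    if name.endswith("_alt"):
        return "alt"
    # Unplaced/unlocalized scaffolds, decoys, HLA sequences, etc.
    return "other"


def partition_contigs(names):
    """Group contig names by class, preserving input order."""
    groups = {"primary": [], "alt": [], "other": []}
    for name in names:
        groups[classify_contig(name)].append(name)
    return groups
```

With a split like this, the ALT group could be scheduled as its own unit of work or explicitly excluded, rather than silently dropped.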
Another thing to watch out for is the effect on mapping quality (MAPQ).
- Does BWA work with ALT contigs in the GRCh38 release? Yes, since 0.7.11, BWA-MEM officially supports mapping to GRCh38+ALT. BWA-backtrack and BWA-SW don't properly support ALT mapping as of now. Please see README-alt.md for details. Briefly, it is recommended to use bwakit, the binary release of BWA, for generating the reference genome and for mapping.
- Can I just run BWA-MEM against GRCh38+ALT without post-processing? If you are not interested in hits to ALT contigs, it is okay to run BWA-MEM without post-processing. The alignments produced this way are very close to alignments against GRCh38 without ALT contigs. Nonetheless, applying post-processing helps to reduce false mappings caused by reads from the diverged part of ALT contigs and also enables HLA typing. It is recommended to run the post-processing script.
This page shows some examples: https://github.com/lh3/bwa/blob/master/README-alt.md
If we align sequence reads to GRCh38+ALT blindly, we will get many additional reads with zero mapping quality and miss variants on them.
Similarly, it seems STAR suggests dropping the ALT contigs: https://github.com/alexdobin/STAR/issues/39#issuecomment-101214342
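One way to check the MAPQ concern raised above would be to measure the fraction of mapped reads with mapping quality zero per contig. The sketch below parses plain-text SAM records (column 2 is FLAG, column 3 is RNAME, column 5 is MAPQ); `mapq_zero_fraction` is a hypothetical helper, and reading from text lines rather than a BAM library is an assumption made to keep the example self-contained.

```python
from collections import Counter


def mapq_zero_fraction(sam_lines):
    """Return {contig: fraction of mapped reads with MAPQ == 0}."""
    total = Counter()
    zero = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        contig = fields[2]
        mapq = int(fields[4])
        total[contig] += 1
        if mapq == 0:
            zero[contig] += 1
    return {contig: zero[contig] / total[contig] for contig in total}
```

Comparing these fractions between an alignment against the reference without ALT contigs and one against GRCh38+ALT without post-processing should show how many reads lose their mapping quality.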