rebber closed this issue 1 year ago
Hi Rebecka,
SAGE calls a candidate variant for each unique combination of chr, pos, ref, alt AND readContext. The readContext includes 2 flanking bases on each side, and sometimes more if there is a repeat that makes the variant ambiguous, so it is possible and expected that we may have multiple records with the same variant but different contexts. The typical case where this may occur is when you have 2 somatic variants within 2 bases of each other that are partially phased (i.e. all reads with variant B contain variant A, but only some reads with variant A contain variant B). The most common case for this is a mixed somatic/germline MNV. Another case is when the same variant is called with repeat contexts of different lengths (such as your example above at 8:13988790).
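To see which records in a given VCF are affected, the grouping described above can be sketched in a few lines of Python. This is a minimal sketch, not part of SAGE: the function name `find_duplicate_variants` is hypothetical, the input is assumed to be plain tab-separated VCF text lines with a single ALT per record, and the read context is assumed to sit in the INFO tag RC as mentioned in the question. The example records and the filter name `toy_filter` are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_variants(vcf_lines):
    """Group records by (CHROM, POS, REF, ALT); return only groups with >1 record.

    Hypothetical helper for illustration, not a SAGE function.
    """
    groups = defaultdict(list)
    for line in vcf_lines:
        # Skip blank lines and header lines ("##..." meta lines and "#CHROM").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, filt, info = fields[:8]

        # Parse INFO into a dict; flag-style entries without "=" map to True.
        info_map = {}
        for item in info.split(";"):
            key, _, value = item.partition("=")
            info_map[key] = value if value else True

        groups[(chrom, int(pos), ref, alt)].append(
            {"filter": filt, "rc": info_map.get("RC")}
        )
    return {key: recs for key, recs in groups.items() if len(recs) > 1}


# Made-up records: the same variant twice with different read contexts,
# plus one unrelated variant.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tT\t50\tPASS\tRC=GGATC",
    "1\t100\t.\tA\tT\t30\ttoy_filter\tRC=GGATCC",
    "1\t200\t.\tG\tC\t60\tPASS\tRC=AAGTT",
]

for key, records in find_duplicate_variants(example).items():
    print(key, records)
```

Each reported group lists the FILTER value and read context of every record sharing the same chr, pos, ref and alt, which makes it easy to check whether the contexts differ only in repeat length.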
Later, when we phase, we use the ref context to help with the phasing, including for identifying partially phased variants. Only PASS variants are considered for this phasing. If variants are partially phased, we will normally call a phased MNV and an unphased SNV, with the duplicate SNV deduped by the MNV.
In practice, I think it is very unlikely that we call 2 identical variants with different read contexts where both variants PASS, but I cannot rule this out.
So in summary: this is expected behaviour, but it should be exceptionally rare to have 2 duplicate variants that both PASS.
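If you want to check how often this actually happens in your own output, a rough sketch along these lines would count the variants that appear more than once with FILTER=PASS. The function name `count_pass_duplicates` is hypothetical, the input is assumed to be plain tab-separated VCF text lines with a single ALT per record, and the example records are made up.

```python
from collections import Counter


def count_pass_duplicates(vcf_lines):
    """Return {(CHROM, POS, REF, ALT): n} for variants with n > 1 PASS records.

    Hypothetical helper for illustration, not a SAGE function.
    """
    pass_counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt, _qual, filt = line.rstrip("\n").split("\t")[:7]
        if filt == "PASS":
            pass_counts[(chrom, int(pos), ref, alt)] += 1
    return {key: n for key, n in pass_counts.items() if n > 1}


# Made-up records: one variant passes twice, another passes once and is
# filtered once (the filter name "toy_filter" is invented).
records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "2\t500\t.\tC\tG\t80\tPASS\tRC=TTCAA",
    "2\t500\t.\tC\tG\t75\tPASS\tRC=TTCAAA",
    "3\t900\t.\tG\tA\t40\tPASS\tRC=ACGTT",
    "3\t900\t.\tG\tA\t20\ttoy_filter\tRC=ACGTTT",
]

print(count_pass_duplicates(records))
```

An empty result means no variant has two passing records, which is the expected outcome in nearly all cases according to the explanation above.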
Peter
Hi,
I'm running SAGE v3.2.3 and have discovered that some variants appear multiple times in the same VCF. CHROM, POS, REF and ALT are the same, but the read context (INFO tag RC) differs slightly, and so do the other tags related to read context.
Examples:
Why are they output separately like this? How should I think about the quality of such variants, and how should I interpret them? For example, in the chr 7 case, one entry has FILTER=PASS and the other is filtered out. Do you have any recommendations for merging the duplicate variants, or for prioritizing between them?
I have run with the -panel_only argument because I have targeted data, and have adjusted the options according to https://github.com/hartwigmedical/hmftools/blob/master/README_TARGETED.md but with a lower min VAF. This is the exact command I have run:
Best regards,
Rebecka