Closed julesjacobsen closed 3 years ago
Yes, I resorted to implementing BreakendedSvannaVariant
from scratch. The class implements SvannaVariant, Breakended
, where SvannaVariant
interface extends Variant
and some other interfaces that add bits specific to SvAnna.
I think having BaseBreakendedVariant
would be useful.
OK will add one. Will need one for Exomiser too...
In progress on this one. So far I've made a few changes such as:
BreakendVariant
to DefaultBreakendVariant
Breakended
to BreakendVariant
BreakendVariant
to BreakendVariant extends Variant
Breakend
@ielis I have a question which I want to double-check something. Can you confirm that the trailingRef
variable is never used? From what I can see it's only ever set as an empty string in the constructor, so its a bit of legacy code which has been replaced with the existence of the VcfBreakendFormatter
and soon the VcfBreakendVariantParser
(issue #36)
/**
* When variant is converted to the opposite strand, then the reverse complemented ref allele is stored here.
* We need to store the allele in order to reconstruct the original ref allele if variant is converted again to the
* original strand.
*/
private final String trailingRef;
@Override
public DefaultBreakendVariant toOppositeStrand() {
return new DefaultBreakendVariant(eventId, right.toOppositeStrand(), left.toOppositeStrand(),
Seq.reverseComplement(trailingRef), Seq.reverseComplement(ref), Seq.reverseComplement(alt));
}
works like this:
org.monarchinitiative.variant.api.impl.DefaultBreakendVariantTest,toOppositeStrand
// 13 123457 bnd_X A [17:198983[A 6 PASS SVTYPE=BND;MATEID=bnd_Z;EVENT=tra3
Original: BreakendVariant{eventId='tra3', left=Breakend{contig=13, id='bnd_X', strand=-, coordinateSystem=FULLY_CLOSED, startPosition=115046422, endPosition=115046422}, right=Breakend{contig=17, id='bnd_Z', strand=+, coordinateSystem=FULLY_CLOSED, startPosition=198983, endPosition=198983}, ref=A, alt=''}
Opposite: BreakendVariant{eventId='tra3', left=Breakend{contig=17, id='bnd_Z', strand=-, coordinateSystem=FULLY_CLOSED, startPosition=83058459, endPosition=83058459}, right=Breakend{contig=13, id='bnd_X', strand=+, coordinateSystem=FULLY_CLOSED, startPosition=123457, endPosition=123457}, ref=, alt=''}
Is it still needed, or could be replaced with a constant empty string? I suspect this probably isn't always the case, for example if a toOppositeStranded BreakendVariant is copied e.g.
BreakendVariant transformed = BreakendVariant.builder().with(original.toOppositeStrand()).build().toOppositeStrand()
transformed
should equal original
Just a note- might be easier to add a sequence
to the Breakend
s, so it is always attached to the correct end and store the inserted sequence, if any, on the BreakendVariant
although this is what the alt
field currently does.
See commit: 967ff30
Hi @julesjacobsen ,
the trailingRef
field is used in toOppositeStrand()
method, where it is passed instead of ref
. It's not a legacy code.
When creating a breakend variant from a VCF record, we have base sequence for ref allele for left breakend only. In case we convert the breakend variant to the opposite strand, we must set ref
to an empty string, since we do not have the base sequence for the opposite strand. If we want to convert the variant back to the original strand, we will miss the ref
allele if we do not store it. That's why we store the ref allele into trailingRef
field and use the field during the strand conversion.
@ielis, it's really just a bit of a mess which is why I dislike it. This is related to the question in issue #39. The problem arises due to the inclusion of the 'left' ref base, but not that on the 'right', so when flipping the strands you end up losing information.
The solutions are:
Variant
types)this
when trying to do a BreakendVariant.toOppositeStrand()
. This way nothing changes, and nothing is screwy. The reason being if you're to assess whether a breakend variant overlaps a region (i.e. disrupts the region), you'll need to check each Breakend
independently by calling breakend.toStrand(targetRegion.strand())
. This is my preferred solution and I think has the lowest wtf score.I agree that the solution 2. is not a good one as we should not expect the users to do much in order to make svart's classes work best.
Let's go for the solution 3 then, we can check the breakends independently.
Great - will do 3 is also easiest time implement ;)
I think adding a Variant.isBreakend()
method will help with this. Will need to check that isSymbolic() != isBreakend()
Unfortunately BreakendVariant.isSymbolic() == true
so will need to have a think about what defines 'symbolic' as opposed to breakend/non-symbolic. In VCF parlance I think a breakend is symbolic, so we'll need to think about wether to change the definition here.
It's possible to do a simple instanceof
test, but these always feel a bit out of place.
@ielis didn't you need to implement your own version of a BreakendVariant? This would have been useful right?