Open dnil opened 11 months ago
Hmm, I wonder if the fix was only implemented for TNscope (https://github.com/Clinical-Genomics/BALSAMIC/pull/540/files), even though VarDict was mentioned in the original issue (https://github.com/Clinical-Genomics/BALSAMIC/issues/485). Since VarDict is only used for TGA, it may not have been as much of an issue as for TNscope, which is used for WGS: SVs are less likely to be called in the smaller panel context. I took a look at some of the VCFs produced by some TGA and WES cases, and the VarDict SVs seem to be more common in WES, which makes sense since it's a larger panel.
I suppose this issue has been around for a while and is probably not super urgent, but I'll bring it up at a refinement session. Possibly we should disable the SV calling from VarDict, or separate it out and add it to the SVDB merge. But since I have no idea how good VarDict is at calling SVs, I'm not ready to say that we should do that yet.
Decision in refinement meeting 2024-01-12 to just remove the SV calls from VarDict
An update to this issue: We will need to keep the SVs in VarDict for now, as this is how we are calling FLT3-ITD at the moment and the clinicians are looking for this variant in the SNV and InDel results. But it would be nice if we could work out a way to clean this up for the future...
The above realisation occurred during testing of this PR https://github.com/Clinical-Genomics/BALSAMIC/pull/1414, which attempted to remove the SVs by adding the -U flag to the tumor-only workflow as well (it had already been added to the tumor-normal workflow).
Thank you for the attention to this - it's (occasionally) an annoying problem for the users, and it feels like it only awaits its first case of a misinterpreted causative variant, but so far I guess they manage.
Using VarDict for SV calling seems to make perfect sense, especially if it reproducibly finds the FLT3-ITD in contrast to others. But there should be no short-read scenario where that variant ends up in the SNV file, with SNV annotation? Is there any reason why SVs produced by VarDict should not be split off into an SV VCF directly after calling, and treated as such for the remainder of the pipeline?
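To make the suggestion concrete, here is a minimal sketch of what such a split could look like. It is not how Balsamic does it today, and it assumes that VarDict marks SV records with an `SVTYPE` key in INFO or a symbolic/breakend ALT allele such as `<DEL>`; I have not checked that against every VarDict output. The function names `is_sv_record` and `split_vcf` are made up for illustration.

```python
def is_sv_record(line: str) -> bool:
    """Heuristic: treat a VCF record as an SV if INFO has SVTYPE
    or any ALT allele is symbolic (<DEL>) or a breakend (contains [ or ])."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False  # not a complete VCF record
    alt, info = fields[4], fields[7]
    info_keys = {entry.split("=", 1)[0] for entry in info.split(";")}
    symbolic_alt = any(
        allele.startswith("<") or "[" in allele or "]" in allele
        for allele in alt.split(",")
    )
    return "SVTYPE" in info_keys or symbolic_alt


def split_vcf(lines):
    """Return (header, small_variant_records, sv_records) from VCF lines."""
    header, small_variants, svs = [], [], []
    for line in lines:
        if line.startswith("#"):
            header.append(line)
        elif is_sv_record(line):
            svs.append(line)
        else:
            small_variants.append(line)
    return header, small_variants, svs
```

Both output files would reuse the same header. If FLT3-ITD still has to be visible in the SNV and InDel results, those records would need an explicit exception before a split like this could be used.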
Describe the bug
VarDict SVs once again end up in the SNV VCF files, making them hard to visualise for the user when loaded in Scout.
To Reproduce
Load e.g. case novelbear (or presumably grep for DEL, DUP, BND etc in recent VarDict SNV VCFs). Or say
Expected behavior
Short nucleotide variants and structural variants should appear in different VCF files.
Version (please complete the following information):
12.0.2
Additional context
This has been an issue in the distant past, which I believe was never fully solved together with the VarDict devs, but was worked around by having Balsamic filter the VarDict VCFs into separate files for display & delivery.
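For anyone wanting to check their own cases, the grep mentioned under To Reproduce could be sketched roughly like this. It assumes the SV records carry an `SVTYPE=...` entry in INFO, which I believe is the case for VarDict but have not verified for all versions; `count_sv_types` is just an illustrative name.

```python
from collections import Counter


def count_sv_types(lines):
    """Count records per SVTYPE value (DEL, DUP, BND, ...) in VCF lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # skip malformed or truncated records
        for entry in fields[7].split(";"):
            key, _, value = entry.partition("=")
            if key == "SVTYPE":
                counts[value] += 1
    return counts
```

For a bgzipped file, the lines can be read with `gzip.open(path, "rt")` from the standard library, where `path` points at the VarDict SNV VCF of the case. A non-empty result means SVs are present in the SNV file.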