Open northwestwitch opened 2 years ago
5 Kb to 500 Kb is a lot, considering that the distance is applied on both sides of a gene. Wouldn't you end up with a lot of overlapping calls (ones that look like they are in genes but aren't)?
This is happening for the particular fusion posted above. Zahra and I looked into a couple of patients and we see the event, but we don't see the genes being annotated. IGH is luckily included, but BCL2 will always be missed unless the event happens within the gene itself.
I know it is going to be tricky, and will possibly add a lot of clutter to the CSQ field, so I suggest adding VEP --distance only to SVs from Manta that are BND.
Here is my suggested solution: in https://github.com/Clinical-Genomics/BALSAMIC/blob/c2fc1cd958e5e5403b08f2bbd98f46b8fcaa3e87/BALSAMIC/constants/workflow_params.py#L141-L143 add fusion_param, or whatever name you'd like. Then split the SVDB VCF by category and, for the BND ones, add fusion_param to the vep_somatic_sv rule.
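A rough sketch of the split-then-annotate idea, just to make it concrete. This is not BALSAMIC code: the constant names, the helper functions, and the 500 000 bp value are assumptions for illustration. The only thing taken from VEP itself is the `--distance` option, whose default is 5000 bp.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Hypothetical values; neither the names nor the fusion distance come from BALSAMIC.
DEFAULT_DISTANCE = 5_000    # VEP's default --distance
FUSION_DISTANCE = 500_000   # the wider window discussed in this thread

_SVTYPE = re.compile(r"(?:^|;)SVTYPE=([^;]+)")


def svtype(vcf_line: str) -> Optional[str]:
    """Return the SVTYPE value from the INFO column (8th field) of a VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return None
    match = _SVTYPE.search(fields[7])
    return match.group(1) if match else None


def split_by_bnd(lines: Iterable[str]) -> Tuple[List[str], List[str], List[str]]:
    """Split VCF lines into (header, BND records, all other records)."""
    header, bnd, other = [], [], []
    for line in lines:
        if line.startswith("#"):
            header.append(line)
        elif svtype(line) == "BND":
            bnd.append(line)
        else:
            other.append(line)
    return header, bnd, other


def vep_distance_args(is_bnd: bool) -> List[str]:
    """Build the --distance argument pair for one VEP run."""
    distance = FUSION_DISTANCE if is_bnd else DEFAULT_DISTANCE
    return ["--distance", str(distance)]
```

The header would be written to both output files, the two files annotated separately with the matching arguments, and the results merged again afterwards. Only the BND records get the wide window, which should keep the extra CSQ clutter limited to breakends.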
fyi, in some other fusion events we will see this as well, example: RUNX1-ETV6:
So it will be beneficial for other customers who are curious to see these types of events. 😃
Another possible way to solve this would be to have Scout panels contain not only genes but also chromosomal intervals (we've been discussing this since forever but haven't found a solution yet). It's always in the back of our minds tho (https://github.com/Clinical-Genomics/scout/issues/1907).
Time to start thinking about this a bit more seriously?
I think so, fusions need to be taken a bit more seriously. Looking at VCFs, they are easy to identify for a trained eye but tricky in Scout without gene annotation. Intervals can work, or predefined genic regions might work as well (if that is done before uploading to Scout).
Another option might be a fusion annotation on the DNA side, noting if conceptual gene adjacency changed with the event.
E.g. SnpEff has some support for fusion gene annotation. I haven't used it for a looong time. It still seems maintained though: https://github.com/pcingola/SnpEff. Some other customer seemed to like it for this purpose: https://github.com/Clinical-Genomics/BALSAMIC/issues/771.
I'm dropping here a user question/request from the customer support ticket (#372436):
Pertaining to the issue of SVs and specifically BND, we are trying to look for the following recurrent translocation in 3 of our samples analysed in the same ticket #698630. As seen below, the breakpoints can occur ~500kb up or downstream of the genes.
From Scout, we are unable to call these BNDs using the “BCL2” gene symbol, mainly because the default setting of the VEP annotation in the pipeline is to annotate ±5kb.
Below is how the BND looks in Scout and therefore was missed because it wasn’t annotated with BCL2.
Is it possible to change the default setting of VEP? @hassanfa can maybe explain this better….
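To illustrate why a ±5kb window misses such a breakpoint while ±500kb catches it, here is a minimal sketch. The coordinates are made up for the example and are not the real BCL2 locus. The check is also a simplification: VEP assigns upstream and downstream consequences per transcript and with respect to strand.

```python
def within_window(breakpoint: int, gene_start: int, gene_end: int, distance: int) -> bool:
    """True if the breakpoint lies inside the gene extended by `distance` bp on both sides."""
    return gene_start - distance <= breakpoint <= gene_end + distance


# Hypothetical coordinates: a 200 kb gene with a breakend 300 kb past its end.
gene_start, gene_end = 1_000_000, 1_200_000
breakpoint = 1_500_000

print(within_window(breakpoint, gene_start, gene_end, 5_000))    # False
print(within_window(breakpoint, gene_start, gene_end, 500_000))  # True
```

The same arithmetic shows the cost raised earlier in the thread: each gene's window grows by twice the distance, so with 500 kb on both sides the windows of neighbouring genes will often overlap.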