suhrig / arriba

Fast and accurate gene fusion detection from RNA-Seq data
Other
214 stars 50 forks source link

Is it possible to have draw_fusions.R output the exon number in text? #223

Closed ahdee closed 7 months ago

ahdee commented 7 months ago

Hi - thanks for awesome fusion caller - one of the best hands down. One thing I'm trying to do is to find a way to export the exon number ( where the break occurs) during the time it draws the fusion.

not sure if its possible, but I'm looking at the code at the moment and I can see that the lines drawn are based of some sort of coordinate. For example.

                    geneID geneName        transcript exonNumber       left      right
191012 ENSG00000171094.180      ALK ENST00000389048.8         29 0.00000000 0.06156816
191015 ENSG00000171094.180      ALK ENST00000389048.8         28 0.02429475 0.06156816
191016 ENSG00000171094.180      ALK ENST00000389048.8         28 0.07229433 0.07712110
191017 ENSG00000171094.180      ALK ENST00000389048.8         27 0.07229433 0.07712110
191018 ENSG00000171094.180      ALK ENST00000389048.8         27 0.08784726 0.09503379
191019 ENSG00000171094.180      ALK ENST00000389048.8         26 0.08784726 0.09503379

it draws the lines surrounding the exons like so.

 lines(c(0, 0, fusionOffset1), c(yTrajectoryExonTop, yTrajectoryExonBottom, yTrajectoryFusion), col="red", lty=2)
    lines(c(breakpoint1, breakpoint1, fusionOffset1+breakpoint1), c(yTrajectoryBreakpointLabels, yTrajectoryExonBottom, yTrajectoryFusion), col="red", lty=2)

so the question what would i have to do is extract the "exonNumber" border for the dfs exons1 and exons2. Here is a picture of the output and I can see visually that it breaks on IGFBP5: exon 1 and ALK: exon 18 however I would like to have the ability to extract this as text. image

This will be extremely useful for me, thanks in advance!

suhrig commented 7 months ago

There is a dedicated script to annotate exon numbers:

https://github.com/suhrig/arriba/blob/master/scripts/annotate_exon_numbers.sh

No need to export it from draw_fusions.sh. If you still want to export it from draw_fusions.sh, let me know and I can help you with this.

ahdee commented 7 months ago

thanks @suhrig - if possible it would be great if I can export from the draw_fusions.R - the reason is because I've actually extracted the functions from it and use it universally to draw fusions from agnostically from other callers as well. Thus it would be great if I can just have it as single R function. thanks as usual.

suhrig commented 7 months ago

Here is a modified version of draw_fusions.R which prints the exons closest to the fusion breakpoints:

draw_fusions_exon.R

As you can see, I only added 4 lines: 990, 991, 1000, 1001

ahdee commented 7 months ago

@suhrig thank you so much. This will help tremendously. I also made an ugly hack earlier this week by stealing it from the render itself, basically sucking up all the exons and towards the end did something like this.

if ( test$direction1 == "upstream"){
this_exon_left = exons1 [ min (this_exon_left) - 1, ]$exonNumber
} else {
  this_exon_left = exons1 [ max (this_exon_left) - 1, ]$exonNumber
}

if ( test$direction2 == "downstream" ){
this_exon_right = exons2 [ max ( this_exon_right) - 1, ]$exonNumber
}else {
  this_exon_right = exons2 [ min (this_exon_right) + 1, ]$exonNumber
}