suhrig / arriba

Fast and accurate gene fusion detection from RNA-Seq data

Fusion transcript mismatch between the same breakpoints #240

Closed gitgodgot closed 4 months ago

gitgodgot commented 4 months ago

Hello. Thanks to you, I'm getting a lot of help with my fusion gene analysis.

I found the same fusion gene, DHX9::NPL, in two different samples. In both samples, breakpoint 1 and breakpoint 2 match, but the fusion transcript and the peptide sequence are different. Can the sequences differ even though the breakpoints match? Why did I get this result? I would really appreciate your help.

suhrig commented 4 months ago

Hi, I'll gladly help. The breakpoint coordinates aren't the only thing that is considered for the transcript selection. The entire fusion sequence is considered. Arriba tries to find the transcript which best explains the splice pattern of the fusion-supporting reads. If you're familiar with IGV, Arriba roughly does the following: it takes the sequence from the column transcript_sequence, then goes to Tools -> Blat and finds the transcript which has the best overlap with the coordinates returned by Blat. Technically, it gets there completely differently, but that's a good way to visualize it. If you do this with the sequences from your two samples, you should notice that even though the breakpoints are the same, the fusion sequences align differently. Let me know if anything is still unclear.
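Before pasting anything into Blat, it can help to check which side of the breakpoint actually differs between the two samples. The following is only a rough sketch, not how Arriba works internally. It assumes that the fusion sequence uses `|` to mark the breakpoint and that every character other than a nucleotide letter (for example `___` between exons) is markup that can be dropped. The function names and the toy sequences are made up for illustration.

```python
import re


def split_at_breakpoint(fusion_sequence):
    """Split a fusion sequence at the first '|' (assumed breakpoint marker)."""
    left, _, right = fusion_sequence.partition("|")
    return left, right


def clean_for_blat(segment):
    """Keep only nucleotide letters and uppercase them.

    Assumption: everything else in the string is annotation markup.
    """
    return re.sub(r"[^ACGTNacgtn]", "", segment).upper()


def compare_fusions(seq_a, seq_b):
    """Report whether each side of the breakpoint is identical in two samples."""
    a_left, a_right = split_at_breakpoint(seq_a)
    b_left, b_right = split_at_breakpoint(seq_b)
    return {
        "left_identical": clean_for_blat(a_left) == clean_for_blat(b_left),
        "right_identical": clean_for_blat(a_right) == clean_for_blat(b_right),
    }


# Toy sequences (invented, not real DHX9::NPL data): same breakpoint,
# but the reads continue differently on the right-hand side.
sample_a = "ACGTACGT___GGCCTTAA|TTGACCA___GGATC"
sample_b = "ACGTACGT___GGCCTTAA|TTGACCAGTAAGT"

print(compare_fusions(sample_a, sample_b))
```

If one side comes back as not identical, that is the side worth running through Blat for both samples, since a different splice pattern there can lead to a different transcript being selected.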

gitgodgot commented 4 months ago

Thank you for your quick response. It was a great help. As you suggested, I checked in IGV and it was definitely easy to understand. Thanks a lot :) 👍