ndaniel / fusioncatcher

Finder of Somatic Fusion Genes in RNA-seq data
GNU General Public License v3.0

Inconsistency between RNA sequence and protein sequence at the breakpoint #88

Open gilhornung opened 6 years ago

gilhornung commented 6 years ago

Hi Daniel,

We noticed an inconsistency between the Fusion_sequence and the protein sequence. It is a fusion between HEG1 and SLC12A8, and the reported fusion sequence is:

TCGGGATACTTTCAGTTCAACAAGATGGACCACTCCTGCCGAG*AACATGGTTTCATTGGATATTCACCCGAACTGCTACAGAACAA

When this sequence is translated and compared with the protein sequence at the breakpoint (i.e. YFQFNKMDHSCREHGFIGYSPELLQ), a Glu (E) seems to be missing right at the fusion breakpoint. The protein sequence is the correct one. I think the problem is somehow related to the fact that there is a splice junction just inside the aforementioned Glu codon.
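To illustrate the split-codon point, here is a minimal Python sketch that translates the pasted sequence two ways: once across the `*` junction, and once with each side translated on its own. It is not FusionCatcher's code. The reading frame (frame 0 of the pasted sequence) is an assumption chosen because it matches the reported peptide, and the names `translate`, `joined` and `split` are only illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate whole codons starting at position 0; trailing bases are dropped."""
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt) - 2, 3))


# Fusion sequence as pasted in this issue; '*' marks the junction.
reported = ("TCGGGATACTTTCAGTTCAACAAGATGGACCACTCCTGCCGAG"
            "*AACATGGTTTCATTGGATATTCACCCGAACTGCTACAGAACAA")
expected = "YFQFNKMDHSCREHGFIGYSPELLQ"  # protein sequence at the breakpoint

left, right = reported.split("*")

# Assumption: frame 0 of the pasted sequence is the coding frame.
carry = len(left) % 3      # bases of the junction codon on the left side
skip = (3 - carry) % 3     # bases of the junction codon on the right side
split_codon = left[len(left) - carry:] + right[:skip]

# Translate across the junction (the junction codon is kept).
joined = translate(left + right)

# Translate each side separately (the junction codon is lost).
split = translate(left) + translate(right[skip:])

print("junction codon:", split_codon, "->", CODON_TABLE.get(split_codon, "-"))
print("joined:", joined, "| contains expected peptide:", expected in joined)
print("split: ", split, "| contains expected peptide:", expected in split)
```

With the sequence as pasted above, the junction codon is G|AA (Glu). The joined translation contains the reported peptide, while the side-by-side translation gives the same peptide without the E, which matches the symptom described.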

Below I attach an IGV image of the reads that support the fusion

All the files can be found at: https://owncloud.incpm.weizmann.ac.il/owncloud/index.php/s/w7q5Ej8vvdAtUjl

[Image: slc12ab_fusio_reads]

ndaniel commented 6 years ago

Thanks! I confirm that it is indeed a bug.

ndaniel commented 5 years ago

I have finally had some time to look into this, but the link above is now broken. I am not sure whether this can be fixed without it.

gilhornung commented 5 years ago

Try: https://owncloud.incpm.weizmann.ac.il/owncloud/index.php/s/ouFfguJbMXF9QxN