ndaniel / fusioncatcher

Finder of Somatic Fusion Genes in RNA-seq data
GNU General Public License v3.0

Question about the filter applied. #181

dtomas1989 closed this issue 3 years ago

dtomas1989 commented 3 years ago

Dear all,

Inside the file called "info.txt" I found the following fusion candidate:

ENSG00000118058 ENSG00000130396 21 KMT2A AFDN known,oncogene,chimerdb2,cgp,ticdb,tcga,cell_lines,chimerdb3kb,chimerdb3pub,chimerdb3seq,cancer,tumor,tcga-cancer,mitelman further_analysis 0

This fusion is very interesting to me, but I don't understand why it doesn't appear in the final file (final-list_candidate_fusion_genes.txt).

Can you help me understand it?

Thank you in advance, Best regards

ndaniel commented 3 years ago

Hello,

It means that some paired-end reads support the fusion, but FusionCatcher was not able to find the fusion junction (i.e., reads that map over the fusion junction).

Cheers, Daniel

dtomas1989 commented 3 years ago

First of all, thank you so much for your quick answer. The info.txt file contains the following columns: Fusion_gene_1, Fusion_gene_2, Count_paired-end_reads, Fusion_gene_symbol_1, Fusion_gene_symbol_2, Fusion_description, Analysis_status, Counts_of_common_mapping_reads

What is the difference between Count_paired-end_reads and Counts_of_common_mapping_reads?

I ask because the following fusion is in the final file, and it also has 0 Counts_of_common_mapping_reads:

ENSG00000014216 ENSG00000181163 28 CAPN1 NPM1 oncogene,cancer,tumor,m24,t33 further_analysis 0

Thank you so much in advance, it is very important for me, Best regards

ndaniel commented 3 years ago

Count_paired-end_reads counts the reads like R1/1, R1/2, R2/1, R2/2 shown in Figure 1 here: https://www.biorxiv.org/content/10.1101/011650v1.full.pdf

Counts_of_common_mapping_reads tells how many reads map simultaneously to both genes that form the fusion. It basically tells how similar, at the RNA sequence level, the genes that form the fusion are. If the two genes have very high sequence similarity (for example, a gene and its pseudogene), then for sure they do not form a fusion. The smaller this number, the more likely it is that the fusion is real.
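
For anyone who wants to inspect these columns programmatically, here is a minimal sketch in Python that reads the two candidate rows quoted in this thread into named fields. It is not part of FusionCatcher: the names `InfoRow`, `parse_info_row`, and `shares_sequence` are made up for illustration, the rows are assumed to be whitespace-separated with exactly the eight columns listed above, and the `max_common` cutoff is a hypothetical parameter, not a threshold that FusionCatcher is known to use.

```python
from typing import List, NamedTuple


class InfoRow(NamedTuple):
    """One candidate row, using the eight columns listed in this thread."""
    fusion_gene_1: str
    fusion_gene_2: str
    count_paired_end_reads: int
    fusion_gene_symbol_1: str
    fusion_gene_symbol_2: str
    fusion_description: List[str]
    analysis_status: str
    counts_of_common_mapping_reads: int


def parse_info_row(line: str) -> InfoRow:
    """Split one row on whitespace (assumption: no field is empty)."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, found {len(fields)}")
    gene1, gene2, pairs, sym1, sym2, labels, status, common = fields
    return InfoRow(gene1, gene2, int(pairs), sym1, sym2,
                   labels.split(","), status, int(common))


def shares_sequence(row: InfoRow, max_common: int = 0) -> bool:
    """True if more than max_common reads map to both genes.

    max_common is a hypothetical cutoff chosen for this example only.
    """
    return row.counts_of_common_mapping_reads > max_common


# The two rows quoted earlier in this thread.
EXAMPLE_ROWS = [
    "ENSG00000118058 ENSG00000130396 21 KMT2A AFDN "
    "known,oncogene,chimerdb2,cgp,ticdb,tcga,cell_lines,chimerdb3kb,"
    "chimerdb3pub,chimerdb3seq,cancer,tumor,tcga-cancer,mitelman "
    "further_analysis 0",
    "ENSG00000014216 ENSG00000181163 28 CAPN1 NPM1 "
    "oncogene,cancer,tumor,m24,t33 further_analysis 0",
]

rows = [parse_info_row(line) for line in EXAMPLE_ROWS]

for row in rows:
    print(row.fusion_gene_symbol_1, row.fusion_gene_symbol_2,
          "supporting pairs:", row.count_paired_end_reads,
          "common mapping reads:", row.counts_of_common_mapping_reads,
          "shares sequence:", shares_sequence(row))
```

Both rows have 0 common mapping reads, so this column does not distinguish them. As explained above, a candidate can still be missing from the final list when no reads are found that map over the fusion junction, and that information is not in these eight columns.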