icbi-lab / NeoFuse

NeoFuse is a user-friendly pipeline for the prediction of fusion neoantigens from tumor RNA-seq data.
GNU General Public License v3.0

Question about premature stop codon of fusion transcript #18

Closed: whiffen-cann closed this issue 2 years ago

whiffen-cann commented 2 years ago

What is the impact of PMSCs (premature stop codons) on fusion translation, in detail? Since PMSCs are clearly detected, is a peptide with PMSC information in the NeoFuse_filtered.tsv file a product of the fusion transcript terminating at the PMSC? Or is the peptide a raw translation of the fusion transcript, without any PMSC influence?

abyssum commented 2 years ago

Hello @whiffen-cann,

The correct answer is the first one: a peptide with PMSC information in the NeoFuse_filtered.tsv file is a product of the fusion transcript terminating at the PMSC. You can also check the output files under /sample_ID/Arriba/, where you can review the translated fusion protein sequence. An asterisk (*) signifies that a premature (early) stop codon was introduced in transcript 2, meaning the part of the transcript after the junction point. The translated fusion protein sequence ends at this termination signal, and NeoFuse reports this in the Stop_Codon column (yes/no). When no start codon can be found in the 5' gene, or when there is a stop codon before the fusion junction, the transcript is not translated.

As a brief example, I am attaching the results (concerning only the PMSC) from a local run, sample_filtered.tsv (NeoFuse output):

Fusion | Gene1 | Gene2 | Fusion_Peptide | Stop_Codon
BCAS4-BCAS3 | BCAS4 | BCAS3 | FLTPDPGAEV | yes
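
To pull out only the peptides flagged in the Stop_Codon column, something like the following could work. This is a minimal sketch, not part of NeoFuse: it assumes the filtered file is tab-separated (the table above is shown with `|` only for readability) and that it contains at least the Fusion_Peptide and Stop_Codon columns. The function name `peptides_with_stop_codon` is hypothetical.

```python
import csv
import io


def peptides_with_stop_codon(handle):
    """Return Fusion_Peptide values from rows where Stop_Codon is 'yes'.

    Assumes a tab-separated file with a header row containing
    'Fusion_Peptide' and 'Stop_Codon' (an assumption based on the
    columns shown in the example above).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row["Fusion_Peptide"]
        for row in reader
        if row["Stop_Codon"].strip().lower() == "yes"
    ]


# In-memory stand-in for sample_filtered.tsv, using only the columns shown above.
example = (
    "Fusion\tGene1\tGene2\tFusion_Peptide\tStop_Codon\n"
    "BCAS4-BCAS3\tBCAS4\tBCAS3\tFLTPDPGAEV\tyes\n"
)
flagged = peptides_with_stop_codon(io.StringIO(example))
print(flagged)
```

For a real run, the `io.StringIO` stand-in would be replaced by an opened file handle for the filtered output.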

sample.fusions.tsv (Arriba output)

 #gene1 gene2   fusion_transcript   peptide_sequence
BCAS4   BCAS3 ACGGGCTCCCAGGCAGCCTCCGCCAGCCGGACCCCGTCGCCCTCCTGATGCTGCTCGTGGACGCTGATCAGCCGGAGCCCATGCGCAGCGGGGCGCGCGAGCTCGCGCTCTTCCTGACCCCCGAtCCTGGGGCCGAG|GTACCTTTGACAGGAGCGTGACCCTGCTGGAGGTGTGCGGGAGCTGGCCTGAGGGCTTCGGGCTGCGGCACATGTCCTCCATGGAGCACACGGAGGAGGGCCTCCGGGAGCGACTTGCCGACGCCATGGCCG    GLPGSLRQPDPVALLMLLVDADQPEPMRSGARELALFLTPdPGAE|vpltga*

In the example above, the peptide FLTPDPGAEV is derived from the protein sequence GLPGSLRQPDPVALLMLLVDADQPEPMRSGARELALFLTPdPGAE|vpltga*, which includes the early stop codon (*).
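
The relationship between the two outputs can be checked by hand with a short script. The sketch below is only an illustration under stated assumptions: it treats `|` in the peptide_sequence column as the fusion junction and `*` as the stop codon, as described above, ignores letter case, and does not handle any other annotation characters the column may contain. The helper name `summarize_fusion_peptide` is hypothetical and is not a NeoFuse or Arriba function.

```python
def summarize_fusion_peptide(peptide_sequence, neoantigen):
    """Relate a reported peptide to the translated fusion protein.

    Assumptions (for illustration only): '|' marks the fusion junction
    and '*' marks a stop codon in the part after the junction.
    """
    upstream, _, downstream = peptide_sequence.partition("|")
    stop_codon = "*" in downstream
    # Keep only the residues translated before the stop codon.
    translated_downstream = downstream.split("*", 1)[0]
    protein = (upstream + translated_downstream).upper()

    junction = len(upstream)  # index of the first residue after the junction
    start = protein.find(neoantigen.upper())
    end = start + len(neoantigen)
    # The peptide spans the junction if it contains residues from both sides.
    spans_junction = start != -1 and start < junction < end

    return {
        "protein": protein,
        "stop_codon": stop_codon,
        "peptide_found": start != -1,
        "spans_junction": spans_junction,
    }


result = summarize_fusion_peptide(
    "GLPGSLRQPDPVALLMLLVDADQPEPMRSGARELALFLTPdPGAE|vpltga*",
    "FLTPDPGAEV",
)
print(result)
```

For the BCAS4-BCAS3 example, the peptide is found in the translated sequence, overlaps the junction by one residue, and the sequence after the junction ends at a stop codon.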

I hope this answers your question.

whiffen-cann commented 2 years ago

Thank you very much for your thoroughness. Although my question was not very detailed, your answer covered everything I wanted to know.

abyssum commented 2 years ago

I am glad I could help.

whiffen-cann commented 1 year ago

@abyssum long time no see, I miss you so much!