ylab-hi / ScanExitron

A computational workflow for exitron splicing identification
MIT License

How to generate exitron-derived frameshift peptide sequences #10

Closed (ptnaimelmm closed this issue 4 months ago)

ptnaimelmm commented 2 years ago

Hi Carlos,

Would you please share how you generate the protein sequences that result from frameshifts caused by exitron events? As I understand it, exitron-derived peptides cannot be identified from raw MS data unless these variant peptide sequences are included in the protein database used for the MSGFplus search in the cptac.py application. Did I understand that correctly? Thanks.

dolittle007 commented 2 years ago

You are correct. You can use ScanNeo to generate predicted exitron-derived peptides. The input is a VEP-annotated VCF file.

ScanNeo.py fasta -i input.vcf -o output.fasta

ptnaimelmm commented 2 years ago

Hi Carlos,

Thank you so much for adding this new function. It seems this program only generates protein sequences from exitron-containing isoforms using the original open reading frame. Do you think it would be better to translate these variant RNAs in all three frames? As I mentioned, many predicted neopeptides come from other translation frames. Thanks.
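To make the three-frame idea concrete, here is a minimal sketch in plain Python. It is not part of ScanNeo or ScanExitron; the function names `translate` and `three_frame_translations` are hypothetical, and the input is assumed to be the forward-strand sequence of the exitron-spliced transcript.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built from the
# conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a nucleotide string codon by codon.

    Stop codons are written as '*', codons with unknown bases as 'X',
    and a trailing partial codon is ignored.
    """
    seq = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def three_frame_translations(seq):
    """Return {frame offset: protein string} for the three forward frames."""
    return {frame: translate(seq[frame:]) for frame in range(3)}


if __name__ == "__main__":
    # Toy sequence, not a real transcript.
    for frame, protein in three_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Frame 0 here corresponds to the first base of whatever sequence is passed in, so it only matches the annotated reading frame if the sequence starts at the annotated start codon.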

dolittle007 commented 2 years ago

You are right. By definition, exitrons lie within annotated coding exons, so exitron-derived peptides should be translated from the start codon of the transcript that contains the exitron.
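The reasoning above can be sketched as follows: remove the exitron from the annotated coding sequence, then translate from the annotated start codon, so that any frameshift comes only from an exitron whose length is not a multiple of three. This is an illustrative sketch, not ScanNeo's actual implementation; `exitron_peptide` and its 0-based, half-open coordinates relative to the CDS are assumptions made for the example, and the toy sequence is not a real transcript.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate codon by codon; '*' marks a stop, 'X' an unknown codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def exitron_peptide(cds, exitron_start, exitron_end):
    """Splice an exitron out of a CDS and translate from the start codon.

    cds           -- coding sequence beginning at the annotated ATG
    exitron_start -- 0-based start of the exitron within cds (inclusive)
    exitron_end   -- 0-based end of the exitron within cds (exclusive)

    Returns (peptide up to the first stop codon, is_frameshift).
    """
    spliced = cds[:exitron_start] + cds[exitron_end:]
    is_frameshift = (exitron_end - exitron_start) % 3 != 0
    peptide = translate(spliced).split("*")[0]
    return peptide, is_frameshift


if __name__ == "__main__":
    cds = "ATGGCCAAAGGTTTTCCCGGGTAA"  # translates to MAKGFPG*
    print(exitron_peptide(cds, 6, 10))  # 4-nt exitron: downstream frame shifts
    print(exitron_peptide(cds, 6, 12))  # 6-nt exitron: in-frame deletion
```

Because translation always starts at the annotated ATG, the peptide upstream of the exitron is unchanged, and only the downstream portion differs when the exitron length is not divisible by three.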