Xinglab / IRIS

IRIS: Isoform peptides from RNA splicing for Immunotherapy target Screening

Translation when there is more than one reading frame #24

Open beaferbl opened 4 months ago

beaferbl commented 4 months ago

Hi, I am using IRIS translate for this SE event: ENSG00000162572:SCNN1D:chr1:+:1220951:1221044:1219470:1221306

The output in the prot.fa file is the following:

chr1:1219470|1221306:uniprotFrame:+:form:skp(+)|chr1:1219470|1221306:uniprotFrame:+:form:skp(+)
AATRGGSHLQlQPRRPPGRG
chr1:1219470|1220951:uniprotFrame:+:form:inc1(+)|chr1:1219470|1220951:uniprotFrame:+:form:inc1(+)
AATRGGSHLQsPGPVAPQRP
chr1:1221044|1221306:uniprotFrame:+:form:inc2(+)|chr1:1221044|1221306:uniprotFrame:+:form:inc2(+)
QHNAACKQGQlQPRRPPGRG
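For reference, the three junctions in this output line up with the coordinates in the event ID. The sketch below is only an illustration: the function name `se_junctions` is hypothetical, and the field order (exon start, exon end, upstream exon end, downstream exon start) is an assumption inferred from how the ID matches the junctions above, not taken from the IRIS code.

```python
def se_junctions(event_id):
    """Split an SE event ID into its three junctions.

    Assumed field order (inferred from the output above, not from IRIS):
    gene_id:symbol:chrom:strand:exon_start:exon_end:upstream_end:downstream_start
    """
    (gene_id, symbol, chrom, strand,
     exon_start, exon_end, up_end, down_start) = event_id.split(":")
    return {
        # Skipping form: upstream exon joined directly to downstream exon.
        "skp": (chrom, int(up_end), int(down_start)),
        # Inclusion form, first junction: upstream exon to skipped exon.
        "inc1": (chrom, int(up_end), int(exon_start)),
        # Inclusion form, second junction: skipped exon to downstream exon.
        "inc2": (chrom, int(exon_end), int(down_start)),
    }


event = "ENSG00000162572:SCNN1D:chr1:+:1220951:1221044:1219470:1221306"
junctions = se_junctions(event)
for form, (chrom, left, right) in junctions.items():
    print(f"{form}: {chrom}:{left}|{right}")
```

Both skp and inc1 share the left coordinate 1219470, which is why the reading frame chosen for that coordinate affects both peptides.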

In the uniprot2gtf.blastout.uniprotAll.txt file I can see two coding regions that match the left coordinate of the skp and inc1 junctions: chr1:1219357-1219470 and chr1:1219404-1219470. These have different reading frames, but IRIS only translates the junction according to the last one. Why does this happen?

Thanks, Bea

EricKutschera commented 4 months ago

The code tries to find the sequence to translate based on https://github.com/Xinglab/IRIS/blob/v2.0.1/IRIS/data/uniprot2gtf.blastout.uniprotAll.txt. Once it finds a valid value it stops looking, and there is a comment mentioning that only 1 starting point is used: https://github.com/Xinglab/IRIS/blob/v2.0.1/IRIS/IRIS_translation.py#L80
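As a rough illustration of that behavior (not the actual IRIS code), the sketch below contrasts a lookup that stops at the first matching region with one that collects every match. The function names, the list layout, and the use of region length modulo 3 as a stand-in for the frame are all assumptions; the two regions are the ones quoted in the question. Which region a real search hits first depends on its iteration order, which this sketch does not try to reproduce.

```python
# The two coding regions quoted in the question, as (start, end) pairs.
# The ordering of this list is arbitrary and purely illustrative.
regions = [(1219357, 1219470), (1219404, 1219470)]
junction_left = 1219470


def first_match(regions, left):
    """Return the first region ending at `left`, then stop looking."""
    for start, end in regions:
        if end == left:
            return (start, end)
    return None


def all_matches(regions, left):
    """Return every region ending at `left`."""
    return [(start, end) for start, end in regions if end == left]


def length_mod3(region):
    """Region length modulo 3 (assumes each region starts on a codon boundary)."""
    start, end = region
    return (end - start) % 3


print(first_match(regions, junction_left))
print([length_mod3(r) for r in all_matches(regions, junction_left)])
```

Under those assumptions the two regions give different remainders, so the junction would be read in a different frame depending on which region is picked.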

You could try running IRIS translate with --all-reading-frames so that all reading frames are used: https://github.com/Xinglab/IRIS/blob/v2.0.1/bin/IRIS#L387
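To show what translating in every reading frame means in general, here is a minimal, self-contained sketch using the standard genetic code. It is not how IRIS implements the option; the `translate` helper and the toy sequence are made up for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate `seq` starting at offset `frame` (0, 1, or 2).

    Trailing bases that do not fill a codon are dropped;
    codons with unknown bases become 'X'.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


toy_seq = "ATGGCCTAA"  # made-up sequence, not taken from the event above
peptides = {frame: translate(toy_seq, frame) for frame in range(3)}
for frame, peptide in peptides.items():
    print(frame, peptide)
```

The same nucleotide sequence yields a different peptide in each frame, which is why the choice of starting point matters for the junction peptides.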

beaferbl commented 4 months ago

Okay, I see. Thank you!