Hello all,
I have successfully run both braker_RNA and braker_proteins as part of my structural annotation pipeline. I would now like to convert the TSEBRA output (tsebra.gtf) to the standard GFF3, protein and CDS files.
These are the steps that I run:
rename_gtf.py --gtf tsebra.gtf --out tsebra_renamed.gtf
gtf2gff.pl < tsebra_renamed.gtf --out=tsebra_renamed.gff3 --gff3 --printExon
getAnnoFasta.pl tsebra_renamed.gtf --seqfile=genome.fa --chop_cds
gtf2aa.pl genome.fa tsebra_renamed.gtf tsebra_renamed.aa
However, some of the predicted proteins contain in-frame stop codons, which we do not want.
Could you please tell me how I can fix this?
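In case it helps to narrow the problem down, here is a minimal sketch of how the affected proteins could be listed. It assumes that the protein FASTA (tsebra_renamed.aa) marks stop codons with `*`, which I have not verified for every tool above. The function names `read_fasta` and `internal_stops` are hypothetical helpers and are not part of BRAKER, TSEBRA or AUGUSTUS.

```python
import sys
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def internal_stops(lines: Iterable[str], stop_char: str = "*") -> Dict[str, List[int]]:
    """Map each record ID to the 1-based positions of stops before the last residue.

    Assumption: a stop codon is written as `stop_char` in the protein sequence.
    A single trailing stop is treated as the normal end of the protein.
    """
    hits: Dict[str, List[int]] = {}
    for name, seq in read_fasta(lines):
        body = seq[:-1] if seq.endswith(stop_char) else seq
        positions = [i + 1 for i, aa in enumerate(body) if aa == stop_char]
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_internal_stops.py tsebra_renamed.aa
    with open(sys.argv[1]) as handle:
        for name, positions in internal_stops(handle).items():
            print(name, ",".join(map(str, positions)), sep="\t")
```

This only reports which transcripts are affected and at which amino-acid position; it does not fix them.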
Thank you,
Nathaly