gpertea / gffread

GFF/GTF utility providing format conversions, region filtering, FASTA sequence extraction and more
MIT License

Problem with in-frame stop codons encoding selenocysteine #113

Open MiSchwabe opened 2 years ago

MiSchwabe commented 2 years ago

Hi,

I have a problem extracting the proper faa sequences when selenocysteine (U) is encoded by TGA/TAG in the CDS. Bakta was used for the annotation, and my colleague only saved the GFF3, GBK and genome files. So I used gffread (-y) to get the protein sequences. This resulted in a "." in the middle of multiple protein sequences. Is there any option to prevent this?

Best, Michael
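
For illustration, one possible workaround is to post-process the protein FASTA produced by `gffread -y` rather than change how it is generated. The sketch below is not a gffread feature; the function names `read_fasta` and `mark_internal_stops` are hypothetical. It assumes that every "." inside a sequence really corresponds to a selenocysteine codon, which should be checked against the Bakta annotation first, since an internal "." can also indicate a pseudogene or a frame problem. A "." at the very end of a sequence is treated as the normal terminal stop and left alone.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def mark_internal_stops(seq: str, replacement: str = "U") -> str:
    """Replace every internal '.' with `replacement`.

    Assumption: internal stops are selenocysteine codons. A trailing '.'
    is kept as the terminal stop.
    """
    has_terminal_stop = seq.endswith(".")
    body = seq[:-1] if has_terminal_stop else seq
    body = body.replace(".", replacement)
    return body + ("." if has_terminal_stop else "")


# Usage with a small in-memory example instead of a real gffread output file.
example = io.StringIO(">gene1\nMKT.LLA\nGG.\n>gene2\nMAV\n")
for header, seq in read_fasta(example):
    print(header)
    print(mark_internal_stops(seq))
```

Joining the wrapped lines before replacing matters: a "." at the end of a wrapped line is not necessarily the end of the protein, so the replacement is applied to the full record rather than line by line.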