Bakta and the various standard output formats (Genbank, EMBL, GFF3) use slightly different terms and approaches to declare truncated genes and pseudogenes.
In Bakta, a feature is declared as truncated if there is information from a downstream analysis tool, e.g. Pyrodigal, Infernal, etc.
Besides these, Bakta accepts true pseudogenes from tRNAscan-SE and from its own internal CDS workflow.
To strictly follow INSDC specs for Genbank, EMBL and GFF3 output files (#330), Bakta now declares all truncated features as `pseudo`, which reflects technical issues such as sequencing and assembly errors, as opposed to true pseudogenes, which emerge from biological pseudogenization events such as InDels and mutations.
Internally, Bakta uses `truncated` and `pseudogene` attributes to reflect the different states. In the human-readable TSV output file (meant for a quick glimpse), Bakta adds the feature product prefixes `(pseudo)`, `(truncated)`, `(5' truncated)` and `(3' truncated)`.
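
To illustrate how the two internal states could translate into the TSV product prefixes, here is a minimal sketch. It is not Bakta's actual code: the function name `tsv_product`, the `truncated` values (`"5-prime"`, `"3-prime"`, `"both"`) and the rule that a pseudogene takes precedence over a truncation are all assumptions made for this example.

```python
from typing import Optional

# Hypothetical encoding of the internal `truncated` attribute (assumed for this sketch).
TRUNCATION_PREFIXES = {
    "5-prime": "(5' truncated)",
    "3-prime": "(3' truncated)",
    "both": "(truncated)",
}


def tsv_product(product: str, pseudogene: bool = False, truncated: Optional[str] = None) -> str:
    """Return the product string with the prefix used in the TSV output."""
    if pseudogene:
        # Assumption: a true pseudogene is reported as such, regardless of truncation.
        return f"(pseudo) {product}"
    if truncated is not None:
        return f"{TRUNCATION_PREFIXES[truncated]} {product}"
    # Complete, intact feature: no prefix.
    return product


print(tsv_product("transposase", truncated="5-prime"))  # (5' truncated) transposase
```

In the Genbank, EMBL and GFF3 files, the truncated case would be declared as pseudo instead, as described above.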