Closed by wksmits 9 years ago
@wksmits This bug is part of the "tbl2asn" tool, which NCBI provides and Prokka uses to write the .GBK file. I have emailed NCBI without luck.
I should probably write a post-processing script in Prokka to "fix" the Genbank file, which could also add proper VERSION and ACCESSION values. I also have difficulty coaxing "tbl2asn" to do what I want.
This still isn't fixed in the latest tbl2asn, so I've added a sed filter to fix it myself.
Good to know this is fixed in Prokka 1.11 :)
I just hit this with GenBank output from Prokka 1.10 using the EMBL flat file validator embl-client.jar dated 2015/09/10 (which has no version information) from ftp://ftp.ebi.ac.uk/pub/databases/ena/lib/ as documented on http://www.ebi.ac.uk/ena/software/flat-file-validator (presumably a renamed version of EnaValidator.jar).
The sed fix copied from https://github.com/tseemann/prokka/commit/94d1b057aafecde55457d947c9a52e8ab7dec494 resolved this error:
ERROR: Feature qualifier "inference" does not contain one of the permitted values - " profile" is not permitted (QualifierCheck-4) line: 21 of sample.gbk
from line 21 in my file:
/inference="COORDINATES: profile:Aragorn:1.2"
which should be:
/inference="COORDINATES:profile:Aragorn:1.2"
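For anyone stuck on an older Prokka, the same clean-up can be sketched in Python. This is not the sed expression from the commit linked above, only a rough equivalent under the assumption that the stray whitespace always sits directly after `COORDINATES:` on the same line as `/inference="`. The function name `fix_inference_spacing` is made up for illustration.

```python
import re

# Assumption: the unwanted space (or tab) directly follows 'COORDINATES:'
# and is on the same line as the opening of the /inference qualifier.
_SPACED_COORDINATES = re.compile(r'(/inference="COORDINATES:)[ \t]+')


def fix_inference_spacing(text: str) -> str:
    """Return text with whitespace after 'COORDINATES:' removed in /inference qualifiers."""
    return _SPACED_COORDINATES.sub(r"\1", text)
```

Applying it to the full text of a file such as sample.gbk and writing the result back out should leave everything else untouched, and running it twice changes nothing further. It will not catch a qualifier whose value has been wrapped onto the next line right after the colon.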
I have been using Prokka to annotate de novo generated whole genome sequences of bacteria, based on species or a trusted database of proteins. I use the GBK output of Prokka to import the genome sequence into Artemis, where I tweak the annotation, for instance to add missed pseudogenes. I save the files as EMBL flat files for submission to ENA/SRA. Before submission I run EnaValidator.jar to check for issues with the EMBL file. During these checks it gives an error that turns out to be caused by a space after "COORDINATES:" in the /inference qualifier. When I remove this space manually in Artemis, the error is gone. I don't know where in the Prokka pipeline this space gets inserted, but it would be helpful to fix this (if possible).
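To see which lines of a flat file are affected before running the validator, a small scan like the following may help. It is only a sketch: the function name `find_spaced_inference` is hypothetical, and it assumes the qualifier starts with `/inference="COORDINATES:` on a single line, as in the GenBank output quoted in this thread.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: a space or tab directly after 'COORDINATES:' marks an affected line.
_SPACED = re.compile(r'/inference="COORDINATES:[ \t]')


def find_spaced_inference(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, stripped_line) for each line with the stray whitespace."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if _SPACED.search(line):
            hits.append((number, line.strip()))
    return hits
```

Passing an open file handle works because file objects iterate line by line, and the reported numbers are 1-based so they can be compared with the line numbers in the validator's error messages.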