I've searched through the issues, so hopefully this hasn't been mentioned before. It seems that Prokka runs into problems when a FASTA header starts with >0.
In this case it renames the sequence as SEQ in the annotation lines of the output GFF file but does not rename the sequence in the FASTA section of the same file. Because the sequence IDs in the two sections no longer match, downstream programs can end up skipping these annotations.
I've copied an example below and attached the corresponding input fasta file along with the output gff file.
##gff-version 3
##sequence-region 0 1 526811
##sequence-region 1 1 500965
SEQ Prodigal:002006 CDS 25 351 . + 0 ID=AAJEMFJL_00001;inference=ab initio prediction:Prodigal:002006;locus_tag=AAJEMFJL_00001;product=unannotated protein
SEQ Prodigal:002006 CDS 409 747 . + 0 ID=AAJEMFJL_00002;inference=ab initio prediction:Prodigal:002006;locus_tag=AAJEMFJL_00002;product=unannotated protein
SEQ Prodigal:002006 CDS 753 2168 . - 0 ID=AAJEMFJL_00003;inference=ab initio prediction:Prodigal:002006;locus_tag=AAJEMFJL_00003;product=unannotated protein
.
.
.
.
##FASTA
>0
TACAACCTGCTGTTGGTGTCGCGTATGAAAGAAGAGCTGGGTGCCGGTATCAATACGGGC
ATCATTCGAGCGATGGGTGGGACCGGCAAAGTGGTCACCTCGGCGGGTCTGGTCTTCGCG
I'm using Prokka v1.14.6 and ran the command: prokka --noanno 11861_7#10.fa
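In case it helps anyone check whether their own output is affected, here is a rough sketch of how the mismatch could be detected. It is not part of Prokka; the function name `find_seqid_mismatches` is made up for illustration, and it assumes the first whitespace-delimited field of each feature line is the sequence ID and that the FASTA records follow a `##FASTA` line, as in the output above.

```python
def find_seqid_mismatches(gff_lines):
    """Return sequence IDs used by feature lines that have no FASTA record.

    Hypothetical helper: takes an iterable of lines from a GFF3 file that
    embeds its sequences after a ##FASTA directive.
    """
    feature_ids = set()
    fasta_ids = set()
    in_fasta = False
    for raw in gff_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("##FASTA"):
            # Everything after this directive is sequence data.
            in_fasta = True
            continue
        if in_fasta:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    # The ID is the header text up to the first whitespace.
                    fasta_ids.add(parts[0])
        elif not line.startswith("#"):
            # Feature line: the first column is the sequence ID.
            feature_ids.add(line.split()[0])
    return sorted(feature_ids - fasta_ids)


if __name__ == "__main__":
    import sys

    # Usage: python check_gff.py output.gff
    with open(sys.argv[1]) as handle:
        for seqid in find_seqid_mismatches(handle):
            print(f"feature lines reference '{seqid}' but no FASTA record has that ID")
```

On the output above I would expect this to report SEQ, since the FASTA section still uses 0 as the header.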
test.zip