samlipworth closed this issue 3 years ago
Hi, yes a way to reproduce the issue would be helpful, thanks
emailed to you - thanks
Received them thanks! Will reply as soon as I have a bit of time
The problem is that the header in your FASTA file must use the same sequence ID as the one in the GFF file (the first column of each feature line). Here they differ:
$ head -n 1 R00000049.fasta
>R00000049 gi|150953431|gb|CP000647.1| Klebsiella pneumoniae subsp. pneumoniae MGH 78578, complete sequence
$ head -n 3 R00000049.gff
##gff-version 3
##sequence-region gnl|X|CKMJCFNP_1 1 5315120
gnl|X|CKMJCFNP_1 prokka gene 340 2802 . + . ID=CKMJCFNP_00001_gene;Name=thrA;gene=thrA;locus_tag=CKMJCFNP_00001
If you change the FASTA header to the following, so that its first token matches the GFF sequence ID, the annotation should work:
>gnl|X|CKMJCFNP_1 gi|150953431|gb|CP000647.1| Klebsiella pneumoniae subsp. pneumoniae MGH 78578, complete sequence
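If you have several genomes to fix, the check and the rename can be scripted. The following is only a minimal sketch, not part of pyseer: the helper names `gff_seqids`, `fasta_ids` and `rename_fasta_headers` are made up for illustration, and it assumes a standard tab-delimited GFF3 file and that you supply the old-to-new ID mapping yourself.

```python
def gff_seqids(gff_lines):
    """Return the sequence IDs (column 1 of feature lines) in order of first appearance."""
    ids = []
    for line in gff_lines:
        line = line.strip()
        if line.startswith("##FASTA"):
            break  # anything after this marker is embedded sequence, not features
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        seqid = line.split()[0]
        if seqid not in ids:
            ids.append(seqid)
    return ids


def fasta_ids(fasta_lines):
    """Return the first whitespace-delimited token of every FASTA header."""
    return [(line[1:].split() or [""])[0]
            for line in fasta_lines if line.startswith(">")]


def rename_fasta_headers(fasta_lines, mapping):
    """Replace the ID token of each header found in `mapping`; keep the description."""
    out = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            old_id = parts[0] if parts else ""
            if old_id in mapping:
                description = " " + parts[1] if len(parts) > 1 else ""
                line = ">" + mapping[old_id] + description
        out.append(line)
    return out


# In-memory example using the headers from this thread (sequence shortened to a stub).
fasta = [
    ">R00000049 gi|150953431|gb|CP000647.1| Klebsiella pneumoniae subsp. "
    "pneumoniae MGH 78578, complete sequence",
    "ACGT",
]
gff = [
    "##gff-version 3",
    "##sequence-region gnl|X|CKMJCFNP_1 1 5315120",
    "gnl|X|CKMJCFNP_1\tprokka\tgene\t340\t2802\t.\t+\t.\tID=CKMJCFNP_00001_gene",
]

# IDs present in the GFF but absent from the FASTA headers point to the mismatch.
missing = [s for s in gff_seqids(gff) if s not in fasta_ids(fasta)]
fixed = rename_fasta_headers(fasta, {"R00000049": "gnl|X|CKMJCFNP_1"})
```

For real files you would read the lines with `open(...)` and write `fixed` back out, one line per entry. A non-empty `missing` list means the GFF refers to sequence IDs that no FASTA header provides.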
Ah, that makes sense. Thank you very much.
I'm trying to annotate significant unitigs.
I annotated a complete genome using Prokka, then used the resulting FASTA and GFF files with annotate_hits_pyseer.
The script runs OK, but there are no gene names in the output file. I can email the relevant files if that helps. Many thanks, Sam