tseemann / prokka

Rapid prokaryotic genome annotation

Incorrect annotations from custom database #558

Open jellila opened 3 years ago

jellila commented 3 years ago

Hi!

I am trying to annotate some genomes with a custom database, using --proteins and a gbk file of trusted annotations from an NCBI-annotated genome (let's call it X). However, some annotations in the Prokka output do not match the original ones, even when the input to Prokka is the fasta file of X itself. My next steps are to run roary to get the pangenome of my strains and then use an R script to find which genes are present or absent in different clusters of strains. After these steps, the genes that come up as 'unique' to strain X do not appear in the original X gbk file. I have no idea what's going wrong. Does anyone have any suggestions? Thanks!

Cheers,

Laura
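
One way to narrow this down might be to compare the /product values in the Prokka output with those in the original X gbk file before running roary. The following is only a minimal sketch under stated assumptions: the function names `cds_products` and `products_only_in` are made up for illustration and are not part of Prokka or roary, the parser is a simplified plain-text reader of the GenBank feature table rather than a full GenBank parser, and it assumes products appear as quoted /product qualifiers on CDS features.

```python
import re


def cds_products(gbk_text):
    """Collect the /product values of all CDS features in GenBank text.

    Simplified reader (an assumption, not a full GenBank parser): feature
    keys are expected at column 6 and qualifier values in double quotes.
    """
    products = set()
    in_cds = False
    buf = None  # holds a /product value that wraps over several lines
    for line in gbk_text.splitlines():
        if re.match(r" {5}\S", line):  # a new feature key starts here
            in_cds = line.split()[0] == "CDS"
            buf = None
            continue
        if not in_cds:
            continue
        text = line.strip()
        if buf is None and text.startswith('/product="'):
            buf = text[len('/product="'):]
        elif buf is not None:
            buf += " " + text  # continuation line of a wrapped value
        else:
            continue
        if buf.endswith('"'):  # closing quote reached
            products.add(buf[:-1])
            buf = None
    return products


def products_only_in(first_gbk_text, second_gbk_text):
    """Sorted product names present in the first file but not the second."""
    return sorted(cds_products(first_gbk_text) - cds_products(second_gbk_text))
```

To use it, read both files into strings (for example with `open(path).read()`, where the paths are whatever your own files are called) and call `products_only_in(prokka_text, reference_text)`. It only compares product names as text, so it will not explain why an annotation differs, only which ones do.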

AnaAndrews commented 2 years ago

I have the same problem. It seems to happen more often when I loop over several .fasta files. Also, the .gbk file often contains a "note" somewhere above the translation that carries the annotation from the custom database, while the product is still "hypothetical protein" (i.e. /product="hypothetical protein").
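
To see how many features are affected, the pattern described above can be listed with a small script. This is a rough sketch, not Prokka functionality: `cds_features` and `hypothetical_with_note` are hypothetical helper names, the parsing is a simplified plain-text reading of the GenBank feature table, and it assumes qualifier values are in double quotes and that each CDS has a /locus_tag.

```python
import re


def cds_features(gbk_text):
    """Return one dict per CDS feature, mapping qualifier name -> list of values.

    Simplified reader (an assumption): feature keys at column 6, values quoted.
    """
    features = []
    current = None  # qualifiers of the CDS being read, None outside a CDS
    key = None      # qualifier whose quoted value is still open
    for line in gbk_text.splitlines():
        if not line.strip():
            continue
        if re.match(r" {5}\S", line) or not line.startswith(" "):
            # A feature key line or a record-level line (LOCUS, ORIGIN, //)
            current, key = None, None
            if line.startswith(" ") and line.split()[0] == "CDS":
                current = {}
                features.append(current)
            continue
        if current is None:
            continue
        text = line.strip()
        if key is None:
            m = re.match(r'/(\w+)="(.*)', text)
            if not m:
                continue  # wrapped location or unquoted qualifier
            key = m.group(1)
            current.setdefault(key, []).append(m.group(2))
        else:
            current[key][-1] += " " + text  # wrapped value continues
        if current[key][-1].endswith('"'):
            current[key][-1] = current[key][-1][:-1]
            key = None
    return features


def hypothetical_with_note(gbk_text):
    """List (locus_tag, notes) for CDS called 'hypothetical protein' that carry a /note."""
    hits = []
    for quals in cds_features(gbk_text):
        if quals.get("product") == ["hypothetical protein"] and "note" in quals:
            hits.append((quals.get("locus_tag", ["?"])[0], quals["note"]))
    return hits
```

Running `hypothetical_with_note(open(path).read())` on each output .gbk from the loop (with `path` being your own file name) would show whether the affected loci differ between runs.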