tseemann / prokka

:zap: :aquarius: Rapid prokaryotic genome annotation

Error when annotating a big assembly file. #659

Open RunJiaJi opened 1 year ago

RunJiaJi commented 1 year ago

Hi scientists,

I'm trying to annotate metagenomic assemblies that were assembled directly with Megahit. I have a big FASTA file (318 MB) containing 64096 contigs. After Prokka annotation, I noticed that some of the protein sequences were not properly annotated.

To illustrate the problem, take one contig (>gnl|Prokka|LOCUSTAG_416) as an example: some of its protein sequences were annotated and translated, while others were not. This is easy to see in the GenBank file (contig_LOCUSTAG_416_firstTimeAnno.gbk).
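For anyone who wants to quantify this rather than eyeball it, here is a minimal sketch that counts CDS features in a GenBank file and how many of them carry a `/translation` qualifier. It assumes the standard GenBank flat-file layout (feature keys starting in column 6) and that an untranslated protein shows up as a CDS without `/translation`. The function name `count_cds` and the demo records are made up for illustration; they are not taken from the attached files.

```python
def count_cds(genbank_lines):
    """Return (total CDS features, CDS features with a /translation qualifier)."""
    total = 0
    translated = 0
    in_cds = False
    for line in genbank_lines:
        if not line.startswith(" "):
            # Section headers such as LOCUS, FEATURES, ORIGIN, // end any feature.
            in_cds = False
        elif line.startswith("     ") and line[5:6].strip():
            # A non-space character in column 6 starts a new feature.
            in_cds = line[5:].split()[0] == "CDS"
            if in_cds:
                total += 1
        elif in_cds and line.strip().startswith("/translation="):
            translated += 1
    return total, translated


# Hypothetical demo data, not taken from the attached GenBank files.
demo = [
    "FEATURES             Location/Qualifiers",
    "     source          1..300",
    "     CDS             1..90",
    '                     /locus_tag="LOCUSTAG_00001"',
    '                     /translation="MKT"',
    "     CDS             100..190",
    '                     /locus_tag="LOCUSTAG_00002"',
    "ORIGIN",
    "        1 acgtacgtac",
    "//",
]

total, translated = count_cds(demo)
print(f"{translated} of {total} CDS features have a translation")
```

On a real file, the same function can be given an open file handle, since it only iterates over lines. Running it on both the first and the second annotation of the same contig would show whether the number of translated CDS features differs.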

I then extracted this contig's sequence from the big FASTA file (contig_LOCUSTAG_416.fa) and re-annotated it with Prokka. Surprisingly, this time all proteins were annotated and translated (contig_LOCUSTAG_416_secondTimeAnno.gbk).
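For reference, a minimal sketch of how a single contig could be pulled out of a large multi-FASTA file using only the Python standard library. The function name `extract_contig` and the contig IDs in the demo are hypothetical. Note that the ID to search for would be the original header in the assembly file, which may differ from the `gnl|Prokka|...` name that appears in the Prokka output.

```python
def extract_contig(fasta_lines, contig_id):
    """Return the header and sequence lines of the record whose ID equals contig_id.

    The ID is taken to be the first whitespace-delimited token after '>'.
    Returns an empty list if no record matches.
    """
    record = []
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            if keep:
                break  # reached the next record, so the wanted one is complete
            tokens = line[1:].split()
            keep = bool(tokens) and tokens[0] == contig_id
        if keep:
            record.append(line.rstrip("\n"))
    return record


# Hypothetical demo data standing in for the large assembly file.
demo = [
    ">contig_1 len=8\n", "ACGTACGT\n",
    ">contig_2 len=4\n", "GGCC\n", "TTAA\n",
    ">contig_3\n", "AAAA\n",
]

print("\n".join(extract_contig(demo, "contig_2")))
```

Because the function reads line by line and stops at the next header, it can be passed an open file handle for a file of several hundred megabytes without loading the whole file into memory.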

Can someone explain why this error appears when annotating big files? Thanks in advance.

contig_LOCUSTAG_416.fa.txt contig_LOCUSTAG_416_firstTimeAnno.gbk.txt contig_LOCUSTAG_416_secondTimeAnno.gbk.txt