tseemann / prokka

Rapid prokaryotic genome annotation

HMM product modified to hypothetical protein #504

Closed (sxh1136 closed this issue 4 years ago)

sxh1136 commented 4 years ago

Hi there,

I have some contigs that I suspect are of phage origin, and I'm annotating them with prokka and the pVOG HMM database. Any hits I get from HMMER appear to be changed to "hypothetical protein" instead of keeping their name, e.g. VOG2348.

The relevant section of the log file looks like this:

[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein
[18:07:04] Modify product: => hypothetical protein

The HMM file I'm using looks like this:

HMMER3/f [3.1b2 | February 2015]
NAME  VOG0001
LENG  98
ALPH  amino
RF    no
MM    no
CONS  yes
CS    no
MAP   yes
DATE  Fri Sep 16 16:19:54 2016
NSEQ  16
EFFN  1.199219
CKSUM 1533860583
STATS LOCAL MSV       -9.4991  0.71807
STATS LOCAL VITERBI  -10.2654  0.71807
STATS LOCAL FORWARD   -3.9728  0.71807

I notice it doesn't have a DESC line. Is that what I need to add to get proper annotation?

Thanks in advance!

tseemann commented 4 years ago

Yes, you need a DESC line to get an annotation.

See bottom of this section: https://github.com/tseemann/prokka#fasta-database-format


The same description lines apply to HMM models, except the "NAME" and "DESC" fields are used:

NAME  PRK00001
ACC   PRK00001
DESC  2.1.1.48~~~ermC~~~rRNA adenine N-6-methyltransferase~~~COG1234
LENG  284
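
For a database like the one in the question, where the models have NAME but no DESC, the missing lines could be added with a small script. The following is only a rough sketch in Python, not part of prokka: the function name `add_desc_lines` and the `products` mapping (model name to product text) are made up for illustration, and the product string used in the example is a placeholder rather than a real pVOG description. It assumes the header layout shown above, inserts the DESC line just before LENG to match the ordering in the README excerpt, and leaves the EC number, gene and COG fields empty.

```python
def add_desc_lines(hmm_text, products):
    """Insert a DESC line before LENG in every model that lacks one.

    hmm_text: contents of a HMMER3 text-format .hmm file (one or more models).
    products: hypothetical mapping of model NAME -> product description.
    Models that already have a DESC line, or that are missing from
    `products`, are left unchanged.
    """
    out = []
    name = None        # NAME of the model currently being read
    has_desc = False   # whether this model already has a DESC line
    for line in hmm_text.splitlines():
        parts = line.split(None, 1)
        tag = parts[0] if parts else ""
        if tag == "NAME" and len(parts) == 2:
            name = parts[1].strip()
            has_desc = False
        elif tag == "DESC":
            has_desc = True
        elif tag == "LENG" and not has_desc and name in products:
            # Assumed layout: EC_number~~~gene~~~product~~~COG,
            # with only the product field filled in.
            out.append(f"DESC  ~~~~~~{products[name]}~~~")
            has_desc = True
        elif tag == "//":
            # End of one model record; reset for the next one.
            name = None
            has_desc = False
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "HMMER3/f [3.1b2 | February 2015]\n"
        "NAME  VOG0001\n"
        "LENG  98\n"
        "ALPH  amino\n"
        "//\n"
    )
    # Placeholder product text, not a real pVOG annotation.
    print(add_desc_lines(sample, {"VOG0001": "example phage protein"}), end="")
```

The database would then need to be rebuilt with `hmmpress` before running prokka again. Whether a given product string survives prokka's product clean-up step unchanged is not something this sketch checks.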