Closed by liubovch 5 years ago
Thank you
Genus DBs - read https://github.com/tseemann/prokka#the-genus-databases
The ones I include were carefully curated by my colleagues and me. These days it's much simpler to use --proteins your_annotations.gbk: I recommend downloading your favourite up-to-date annotation from RefSeq and providing that to Prokka.
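For anyone landing here later, that suggestion might look roughly like the sketch below. The file names, output directory, and prefix are placeholders I made up, not anything from this thread. The script only prints the command unless Prokka and both input files are actually present.

```shell
#!/bin/sh
# Hypothetical file names; substitute your own.
REFERENCE="your_annotations.gbk"   # trusted annotation downloaded from RefSeq
CONTIGS="contigs.fa"               # assembly to annotate

# --proteins supplies trusted annotations that Prokka uses as first priority,
# before falling back to its bundled databases.
CMD="prokka --proteins $REFERENCE --outdir annotation --prefix sample $CONTIGS"

if command -v prokka >/dev/null 2>&1 && [ -f "$REFERENCE" ] && [ -f "$CONTIGS" ]; then
    $CMD
else
    # Dry run: Prokka or the input files are missing, so only print the command.
    echo "would run: $CMD"
fi
```

With a reference supplied this way, the age of the bundled genus databases matters much less, because the reference annotations are consulted first.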
HAMAP: there is a prokka-hamap_to_hmm tool included, but HAMAP keep changing their formats. Please realise that these databases don't change as much as you might think. Our understanding of new protein functions grows slowly and incrementally. I update the sprot database regularly, and it usually captures 50-70% of all annotations. HAMAP is a secondary source. My copy of HAMAP.hmm is dated Feb 19 2017. I have this targeted for updating. Thank you for the reminder.
Also, if you want really high quality annotations that are NCBI Refseq standard, you can use PGAP now:
https://github.com/ncbi/pgap/wiki
It's slower but does a really good job. I originally wrote Prokka because PGAP (formerly PGAAP) was not open source, but it is now :-)
I am sorry, I somehow overlooked your recent changes to the README about the genus DBs! Thanks a lot for your comments and suggestions!
Hi,
First of all, thanks for this great tool! It makes life easier!
My question is about the default genus-specific DBs. They seem outdated, as the last changes were made 5 years ago. The same goes for the HAMAP HMM profiles. I think they should be updated more often. Another option would be to include a warning so that users know about this problem and can consider building their own DBs.