tseemann / prokka

:zap: :aquarius: Rapid prokaryotic genome annotation

problem with remote repositories // prokka #365

Closed mggrami closed 5 years ago

mggrami commented 5 years ago

Hello Professor, I cannot figure out how to run Prokka against online repositories. Instead, I'm fetching genomes (using BLAST to assess similarity) by accession number:

fetch_genome_by_accession.sh -a KT932701.1 -o f6

then combining them and adding them to the genus directory:

prokka-genbank_to_fasta_db *.gbk > Coccus.faa
cd-hit -i Coccus.faa -o Coccus -T 0 -M 0 -g 1 -s 0.8 -c 0.9
rm -fv Coccus.faa Coccus.bak.clstr Coccus.clstr
makeblastdb -dbtype prot -in Coccus
mv Coccus.p* /path/to/prokka/db/genus/

and then running the command:

prokka F8-contigs.fa --outdir pF8 --prefix f8 --evalue 0.01 --kingdom Viruses --gcode 11 --metagenome --locustag 245 --strain 245 --genus F8 --usegenus --force

Is there a better, more efficient way to do this? My main goal is to use online databases rather than downloading all the necessary genomes, but I'm not sure how to set them up.

Greetings from Poland, Michal

mggrami commented 5 years ago

I don't know if I phrased my problem correctly. When I submit a sequence for Prokka annotation without specifying a genus database, nearly all the results are hypothetical proteins (when I specify the database, I get hits). And when it does find a proper match, it comes from UniProt. Does anyone know how to expand the online database search? How do I set this up so that the program can do a broad search? Thanks

tseemann commented 5 years ago

The best way to add databases is to download the GenBank file (e.g. genomes.gbk) and provide it to Prokka via the --proteins genomes.gbk option.
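
As a rough sketch of that suggestion, the command from the question could be rewritten to use --proteins instead of a custom genus database. The file names (genomes.gbk, F8-contigs.fa) and the output options are taken from the posts above and are only placeholders here; the script prints the command and runs it only if prokka is on the PATH and both input files exist.

```shell
#!/bin/sh
# Placeholder inputs (assumed): a downloaded GenBank file and the contigs to annotate.
proteins="genomes.gbk"
contigs="F8-contigs.fa"

# Build the Prokka call: --proteins gives the GenBank file first priority
# as an annotation source, so no custom genus database is needed.
set -- prokka --proteins "$proteins" --outdir pF8 --prefix f8 \
       --kingdom Viruses --force "$contigs"

# Show the command that would be run.
cmd="$*"
echo "$cmd"

# Run it only when prokka is installed and both input files are present.
if command -v prokka >/dev/null 2>&1 && [ -f "$proteins" ] && [ -f "$contigs" ]; then
  "$@"
fi
```

With this approach the prokka-genbank_to_fasta_db, cd-hit and makeblastdb steps are not needed, since Prokka reads the GenBank file directly.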