tseemann / prokka

:zap: :aquarius: Rapid prokaryotic genome annotation

Can not run genus specific database #596

Open Aravindphoenix opened 2 years ago

Aravindphoenix commented 2 years ago

Here is my command:

prokka Z10248.fasta --prefix Z10248spa --genus spa --usegenus spa
[15:03:34] This is prokka 1.14.6
[15:03:34] Written by Torsten Seemann torsten.seemann@gmail.com
[15:03:34] Homepage is https://github.com/tseemann/prokka
[15:03:34] Local time is Fri Oct 15 15:03:34 2021
[15:03:34] You are mini
[15:03:34] Operating system is linux
[15:03:34] You have BioPerl 1.7.7
Argument "1.7.7" isn't numeric in numeric lt (<) at /home/mini/prokka/bin/prokka line 259.
[15:03:34] System has 4 cores.
[15:03:34] Option --cpu asked for 8 cores, but system only has 4
[15:03:34] Will use maximum of 4 cores.
[15:03:34] Annotating as >>> Bacteria <<<
[15:03:34] Generating locus_tag from 'Z10248.fasta' contents.
[15:03:34] Setting --locustag ACONMKOD from MD5 ac87648d32b27f918f970c86953691a4
[15:03:34] Creating new output folder: Z10248spa
[15:03:34] Running: mkdir -p Z10248spa
[15:03:34] Using filename prefix: Z10248spa.XXX
[15:03:34] Setting HMMER_NCPU=1
[15:03:34] Writing log to: Z10248spa/Z10248spa.log
[15:03:34] Command: /home/mini/prokka/bin/prokka Z10248.fasta --prefix Z10248spa --genus spa --usegenus spa
[15:03:34] Appending to PATH: /home/mini/prokka/bin
[15:03:34] Looking for 'aragorn' - found /usr/local/bin/aragorn
[15:03:35] Determined aragorn version is 001002 from 'ARAGORN v1.2.38 Dean Laslett'
[15:03:35] Looking for 'barrnap' - found /usr/bin/barrnap
[15:03:35] Determined barrnap version is 000009 from 'barrnap 0.9'
[15:03:35] Looking for 'blastp' - found /usr/local/bin/blastp
[15:03:36] Determined blastp version is 002012 from 'blastp: 2.12.0+'
[15:03:36] Looking for 'cmpress' - found /usr/bin/cmpress
[15:03:36] Determined cmpress version is 001001 from '# INFERNAL 1.1.3 (Nov 2019)'
[15:03:36] Looking for 'cmscan' - found /usr/bin/cmscan
[15:03:37] Determined cmscan version is 001001 from '# INFERNAL 1.1.3 (Nov 2019)'
[15:03:37] Looking for 'egrep' - found /usr/bin/egrep
[15:03:37] Looking for 'find' - found /usr/bin/find
[15:03:37] Looking for 'grep' - found /usr/bin/grep
[15:03:37] Looking for 'hmmpress' - found /usr/bin/hmmpress
[15:03:37] Determined hmmpress version is 003003 from '# HMMER 3.3 (Nov 2019); http://hmmer.org/'
[15:03:37] Looking for 'hmmscan' - found /usr/bin/hmmscan
[15:03:37] Determined hmmscan version is 003003 from '# HMMER 3.3 (Nov 2019); http://hmmer.org/'
[15:03:37] Looking for 'java' - found /usr/bin/java
[15:03:37] Looking for 'makeblastdb' - found /usr/local/bin/makeblastdb
[15:03:38] Determined makeblastdb version is 002012 from 'makeblastdb: 2.12.0+'
[15:03:38] Looking for 'parallel' - found /usr/local/bin/parallel
[15:03:39] Determined parallel version is 20210922 from 'GNU parallel 20210922'
[15:03:39] Looking for 'prodigal' - found /usr/bin/prodigal
[15:03:39] Determined prodigal version is 002006 from 'Prodigal V2.6.3: February, 2016'
[15:03:39] Looking for 'prokka-genbank_to_fasta_db' - found /home/mini/prokka/bin/prokka-genbank_to_fasta_db
[15:03:39] Looking for 'sed' - found /usr/bin/sed
[15:03:39] Looking for 'tbl2asn' - found /usr/bin/tbl2asn
[15:03:39] Determined tbl2asn version is 025003 from 'tbl2asn 25.3 arguments:'
[15:03:39] Using genetic code table 11.
[15:03:39] Loading and checking input file: Z10248.fasta
[15:03:39] Wrote 52 contigs totalling 4621171 bp.
[15:03:39] Predicting tRNAs and tmRNAs
[15:03:39] Running: aragorn -l -gc11 -w Z10248spa\/Z10248spa.fna
[15:03:41] 1 tRNA-Arg [38277,38353] 35 (cct)
[15:03:41] 2 tRNA-Arg [48514,48590] 35 (tct)
[15:03:41] 3 tRNA-Val c[191079,191155] 35 (gac)
[15:03:41] 4 tRNA-Val c[191166,191242] 35 (gac)
[15:03:41] 5 tRNA-Tyr [512160,512244] 35 (gta)
[15:03:41] 6 tRNA-Tyr [512279,512363] 35 (gta)
[15:03:41] 1 tRNA-Asp [453,529] 35 (gtc)
[15:03:41] 2 tRNA-Asp [9893,9969] 35 (gtc)
[15:03:41] 3 tRNA-Thr [86980,87055] 34 (cgt)
[15:03:41] 4 tRNA-Arg [353995,354071] 35 (tct)
[15:03:41] 5 tRNA-Gln c[462265,462339] 33 (ctg)
[15:03:41] 6 tRNA-Gln c[462383,462457] 33 (ctg)
[15:03:41] 7 tRNA-Met c[462504,462580] 35 (cat)
[15:03:41] 8 tRNA-Gln c[462596,462670] 33 (ttg)
[15:03:41] 9 tRNA-Gln c[462706,462780] 33 (ttg)
[15:03:41] 10 tRNA-Leu c[462804,462888] 35 (tag)
[15:03:41] 11 tRNA-Met c[462899,462975] 35 (cat)
[15:03:41] 1 tRNA-Arg [149569,149643] 34 (cct)
[15:03:41] 2 tRNA-Ala c[177437,177512] 34 (ggc)
[15:03:41] 3 tRNA-Val [180218,180293] 34 (tac)
[15:03:41] 4 tRNA-Val [180339,180414] 34 (tac)
[15:03:41] 5 tRNA-Val [180456,180531] 34 (tac)
[15:03:41] 6 tRNA-Lys [180536,180611] 34 (ttt)
[15:03:41] 1 tRNA-Met c[52910,52986] 35 (cat)
[15:03:41] 2 tRNA-Met c[53016,53092] 35 (cat)
[15:03:41] 3 tRNA-Ser [225838,225930] 35 (gct)
[15:03:41] 4 tRNA-Arg [225934,226010] 35 (acg)
[15:03:41] 5 tRNA-Arg [226072,226148] 35 (acg)
[15:03:41] 6 tRNA-Arg [226211,226287] 35 (acg)
[15:03:41] 7 tRNA-Arg c[226227,226303] 35 (tcg)
[15:03:41] 8 tmRNA c[317528,317890] 91,126 ANDENYALAA**
[15:03:41] 1 tRNA-Gly c[2534,2607] 33 (ccc)
[15:03:41] 2 tRNA-Phe [72997,73072] 34 (gaa)
[15:03:41] 3 tRNA-Met [176954,177029] 34 (cat)
[15:03:41] 4 tRNA-Met c[254750,254826] 35 (cat)
[15:03:41] 5 tRNA-Leu c[256711,256797] 35 (gag)
[15:03:41] 1 tRNA-Pro c[112855,112931] 35 (cgg)
[15:03:41] 2 tRNA-SeC [235243,235337] 35 (tca)
[15:03:41] 1 tRNA-Ser [70718,70805] 35 (gga)
[15:03:41] 2 tRNA-Lys c[216658,216733] 34 (ttt)
[15:03:41] 3 tRNA-Lys c[216869,216944] 34 (ttt)
[15:03:41] 4 tRNA-Lys c[217087,217162] 34 (ttt)
[15:03:41] 5 tRNA-Val c[217166,217241] 34 (tac)
[15:03:41] 6 tRNA-Lys c[217378,217453] 34 (ttt)
[15:03:41] 1 tRNA-Pro [224052,224128] 35 (ggg)
[15:03:41] 1 tRNA-Phe c[144018,144093] 34 (gaa)
[15:03:41] 2 tRNA-Gly [173948,174023] 34 (gcc)
[15:03:41] 3 tRNA-Gly [174180,174255] 34 (gcc)
[15:03:41] 4 tRNA-Gly [174412,174487] 34 (gcc)
[15:03:41] 1 tRNA-Leu c[18622,18708] 35 (cag)
[15:03:41] 2 tRNA-Leu c[18740,18826] 35 (cag)
[15:03:41] 3 tRNA-Leu c[18854,18940] 35 (cag)
[15:03:41] 4 tRNA-Arg [45559,45652] 38 (cct)
[15:03:41] 1 tRNA-Ser c[48778,48865] 35 (gga)
[15:03:41] 1 tRNA-Leu c[51822,51906] 35 (caa)
[15:03:41] 1 tRNA-Asn c[19644,19719] 34 (gtt)
[15:03:41] 2 tRNA-Asn [21477,21552] 34 (gtt)
[15:03:41] 3 tRNA-Asn c[24909,24984] 34 (gtt)
[15:03:41] 4 tRNA-Asn c[25532,25607] 34 (gtt)
[15:03:41] 5 tRNA-Ser [26594,26683] 35 (cga)
[15:03:41] 6 tRNA-Gly [72450,72525] 34 (gcc)
[15:03:41] 7 tRNA-Cys [72578,72651] 33 (gca)
[15:03:41] 8 tRNA-Leu [72663,72749] 35 (taa)
[15:03:41] 1 tRNA-Arg [36631,36707] 35 (ccg)
[15:03:41] 2 tRNA-His [36762,36837] 34 (gtg)
[15:03:41] 3 tRNA-Leu [36858,36944] 35 (cag)
[15:03:41] 4 tRNA-Pro [36987,37063] 35 (tgg)
[15:03:41] 1 tRNA-Ser [31287,31374] 35 (tga)
[15:03:41] 1 tRNA-Thr [16779,16853] 33 (tgt)
[15:03:41] 1 tRNA-Thr c[407,482] 34 (ggt)
[15:03:41] 2 tRNA-Gly c[489,563] 34 (tcc)
[15:03:41] 3 tRNA-Tyr c[680,764] 35 (gta)
[15:03:41] 4 tRNA-Thr c[773,848] 34 (tgt)
[15:03:41] 1 tRNA-Asp [47,123] 35 (gtc)
[15:03:41] 2 tRNA-Trp [132,207] 34 (cca)
[15:03:41] 1 tRNA-Ile [21,97] 35 (gat)
[15:03:41] 2 tRNA-Ala [207,282] 34 (tgc)
[15:03:41] 1 tRNA-Thr c[104,179] 34 (ggt)
[15:03:41] 1 tRNA-Glu [37,112] 35 (ttc)
[15:03:41] Found 79 tRNAs
[15:03:41] Predicting Ribosomal RNAs
[15:03:41] Running Barrnap with 4 threads
[15:03:41] 1 2 148 5S ribosomal RNA
[15:03:41] 2 36 21 16S ribosomal RNA
[15:03:41] Found 2 rRNAs
[15:03:41] Skipping ncRNA search, enable with --rfam if desired.
[15:03:41] Total of 80 tRNA + rRNA features
[15:03:41] Predicting coding sequences
[15:03:41] Contigs total 4621171 bp, so using single mode
[15:03:41] Running: prodigal -i Z10248spa\/Z10248spa.fna -c -m -g 11 -p single -f sco -q
[15:03:46] Excluding CDS which overlaps existing RNA (tRNA) at 5:72985..73125 on + strand
[15:03:47] Excluding CDS which overlaps existing RNA (tRNA) at 12:45248..45895 on + strand
[15:03:47] Excluding CDS which overlaps existing RNA (tRNA) at 14:51803..52021 on + strand
[15:03:47] Excluding CDS which overlaps existing RNA (rRNA) at 36:1205..1459 on + strand
[15:03:47] Found 4428 CDS
[15:03:47] Connecting features back to sequences
[15:03:47] Using custom Spa database for annotation
[15:03:47] Annotating CDS, please be patient.
[15:03:47] Will use 4 CPUs for similarity searching.
[15:03:48] There are still 4428 unannotated CDS left (started with 4428)
[15:03:48] Will use blast to search against /home/mini/prokka/db/genus/Spa with 4 CPUs
[15:03:48] Running: cat Z10248spa\/Z10248spa.Spa.tmp.3285.faa | parallel --gnu --plain -j 4 --block 171620 --recstart '>' --pipe blastp -query - -db /home/mini/prokka/db/genus/Spa -evalue 1e-09 -qcov_hsp_perc 80 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Z10248spa\/Z10248spa.Spa.tmp.3285.blast 2> /dev/null

[15:03:52] Could not run command: cat Z10248spa\/Z10248spa.Spa.tmp.3285.faa | parallel --gnu --plain -j 4 --block 171620 --recstart '>' --pipe blastp -query - -db /home/mini/prokka/db/genus/Spa -evalue 1e-09 -qcov_hsp_perc 80 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Z10248spa\/Z10248spa.Spa.tmp.3285.blast 2> /dev/null

Without the genus-specific database, Prokka runs fine, and I can run the following command directly, outside Prokka:

cat Z10248spa\/Z10248spa.Spa.tmp.3285.faa | parallel --gnu --plain -j 4 --block 171620 --recstart '>' --pipe blastp -query - -db /home/mini/prokka/db/genus/Spa -evalue 1e-09 -qcov_hsp_perc 80 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Z10248spa\/Z10248spa.Spa.tmp.3285.blast 2> /dev/null
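The command as logged ends in 2> /dev/null, so anything that parallel or blastp writes to stderr is discarded. A minimal sketch for making those messages visible: the helper name show_stderr is hypothetical (it is not part of Prokka), and it only rewrites the command text; rerunning the result is left to the user.

```shell
#!/bin/sh
# show_stderr CMD
# Hypothetical helper: prints CMD with every "2> /dev/null" redirection removed,
# so the command can be pasted and rerun with error messages visible.
show_stderr() {
  printf '%s\n' "$1" | sed -E 's@[[:space:]]*2>[[:space:]]*/dev/null@@g'
}

# Example with a shortened, made-up command string (not the full logged one):
show_stderr "blastp -query - -db Spa -seg no > out.blast 2> /dev/null"
```

Pasting the printed command into the same shell and working directory Prokka was started from should show whatever blastp or parallel reports on failure.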

I installed Prokka with git clone, built NCBI BLAST from source, and installed the other dependencies individually. Prokka runs fine when I do not specify a genus database, and I get all the output. But when I specify the database, it does not run properly. This happens only when I specify a genus database; otherwise everything is fine. I created my database as follows:

% prokka-genbank_to_fasta_db Spa1.gbk Spa2.gbk Spa3.gbk > Spa.faa
% cd-hit -i Spa.faa -o Spa -T 0 -M 0 -g 1 -s 0.8 -c 0.9
% rm -fv Spa.faa Spa.bak.clstr Spa.clstr
% makeblastdb -dbtype prot -in Spa
% mv Coccus.p* ~/prokka/db/genus/
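Since the last step moves the BLAST index files by wildcard, it may be worth confirming that they actually landed in the genus directory. A rough sketch, assuming the classic .phr/.pin/.psq triple that makeblastdb -dbtype prot writes (newer BLAST+ releases may add further files); the function name check_blast_db is made up for illustration.

```shell
#!/bin/sh
# check_blast_db DIR NAME
# Hypothetical helper: succeeds only if DIR contains non-empty NAME.phr,
# NAME.pin and NAME.psq, the three core files of a protein BLAST database.
# Assumption: the database is small enough that it is not split into volumes.
check_blast_db() {
  dir=$1
  name=$2
  status=0
  for ext in phr pin psq; do
    if [ ! -s "$dir/$name.$ext" ]; then
      echo "missing or empty: $dir/$name.$ext" >&2
      status=1
    fi
  done
  return "$status"
}

# Possible usage for the setup described above:
# check_blast_db "$HOME/prokka/db/genus" Spa && echo "database files look complete"
```

The name is passed as Spa because the log reports the database path as /home/mini/prokka/db/genus/Spa.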

Thanks in advance...

Aravindphoenix commented 2 years ago

Sorry, the last command should be % mv Spa.p* ~/prokka/db/genus/ and not % mv Coccus.p* ~/prokka/db/genus/