tseemann / prokka

Prokka halts at annotation of CDS #522

Open beherrm opened 3 years ago

beherrm commented 3 years ago

Hello,

I am having issues annotating a sequenced isolate. I assembled my reads using shovill and obtained a contigs.fa file. When I run prokka on it, prokka halts as soon as CDS annotation begins (see the log below). Note: I am running Ubuntu under WSL on Windows 10. I used prokka 2 days ago without any issue. I also checked for prokka updates, and it seems I have the latest version.

Could you please advise? The only difference is that I used the --trim option in shovill for this isolate and not for the earlier one.

(base) beherrm@BeaBombastic:~$ prokka --outdir Sinf_annotate --prefix Sinf_590 /home/beherrm/Sinf_spade/Sinf_590_contigs.fa
[16:15:56] This is prokka 1.14.6
[16:15:56] Written by Torsten Seemann torsten.seemann@gmail.com
[16:15:56] Homepage is https://github.com/tseemann/prokka
[16:15:56] Local time is Mon Oct 12 16:15:56 2020
[16:15:56] You are beherrm
[16:15:56] Operating system is linux
[16:15:56] You have BioPerl 1.007002
[16:15:56] System has 8 cores.
[16:15:56] Will use maximum of 8 cores.
[16:15:56] Annotating as >>> Bacteria <<<
[16:15:56] Generating locus_tag from '/home/beherrm/Sinf_spade/Sinf_590_contigs.fa' contents.
[16:15:56] Setting --locustag MJHMNPEN from MD5 631679e7db646ec06255b90ed1fb7790
[16:15:56] Creating new output folder: Sinf_annotate
[16:15:56] Running: mkdir -p Sinf_annotate
[16:15:56] Using filename prefix: Sinf_590.XXX
[16:15:56] Setting HMMER_NCPU=1
[16:15:56] Writing log to: Sinf_annotate/Sinf_590.log
[16:15:56] Command: /home/linuxbrew/.linuxbrew/bin/prokka --outdir Sinf_annotate --prefix Sinf_590 /home/beherrm/Sinf_spade/Sinf_590_contigs.fa
[16:15:56] Appending to PATH: /home/linuxbrew/.linuxbrew/Cellar/prokka/1.14.6/bin
[16:15:56] Looking for 'aragorn' - found /home/linuxbrew/.linuxbrew/bin/aragorn
[16:15:56] Determined aragorn version is 001002 from 'ARAGORN v1.2.38 Dean Laslett'
[16:15:56] Looking for 'barrnap' - found /home/linuxbrew/.linuxbrew/bin/barrnap
[16:15:56] Determined barrnap version is 000009 from 'barrnap 0.9'
[16:15:56] Looking for 'blastp' - found /home/beherrm/miniconda3/bin/blastp
[16:15:56] Determined blastp version is 002009 from 'blastp: 2.9.0+'
[16:15:56] Looking for 'cmpress' - found /home/linuxbrew/.linuxbrew/bin/cmpress
[16:15:56] Determined cmpress version is 001001 from '# INFERNAL 1.1.3 (Nov 2019)'
[16:15:56] Looking for 'cmscan' - found /home/linuxbrew/.linuxbrew/bin/cmscan
[16:15:56] Determined cmscan version is 001001 from '# INFERNAL 1.1.3 (Nov 2019)'
[16:15:56] Looking for 'egrep' - found /bin/egrep
[16:15:56] Looking for 'find' - found /usr/bin/find
[16:15:56] Looking for 'grep' - found /bin/grep
[16:15:56] Looking for 'hmmpress' - found /home/linuxbrew/.linuxbrew/bin/hmmpress
[16:15:56] Determined hmmpress version is 003003 from '# HMMER 3.3.1 (Jul 2020); http://hmmer.org/'
[16:15:56] Looking for 'hmmscan' - found /home/linuxbrew/.linuxbrew/bin/hmmscan
[16:15:56] Determined hmmscan version is 003003 from '# HMMER 3.3.1 (Jul 2020); http://hmmer.org/'
[16:15:56] Looking for 'java' - found /home/linuxbrew/.linuxbrew/bin/java
[16:15:56] Looking for 'makeblastdb' - found /home/beherrm/miniconda3/bin/makeblastdb
[16:15:56] Determined makeblastdb version is 002009 from 'makeblastdb: 2.9.0+'
[16:15:56] Looking for 'minced' - found /home/linuxbrew/.linuxbrew/bin/minced
[16:15:57] Determined minced version is 004002 from 'minced 0.4.2'
[16:15:57] Looking for 'parallel' - found /home/beherrm/miniconda3/bin/parallel
[16:15:57] Determined parallel version is 20200322 from 'GNU parallel 20200322'
[16:15:57] Looking for 'prodigal' - found /home/linuxbrew/.linuxbrew/bin/prodigal
[16:15:57] Determined prodigal version is 002006 from 'Prodigal V2.6.3: February, 2016'
[16:15:57] Looking for 'prokka-genbank_to_fasta_db' - found /home/linuxbrew/.linuxbrew/bin/prokka-genbank_to_fasta_db
[16:15:57] Looking for 'sed' - found /bin/sed
[16:15:57] Looking for 'tbl2asn' - found /home/linuxbrew/.linuxbrew/bin/tbl2asn
[16:15:57] Determined tbl2asn version is 025008 from 'tbl2asn 25.8 arguments:'
[16:15:57] Using genetic code table 11.
[16:15:57] Loading and checking input file: /home/beherrm/Sinf_spade/Sinf_590_contigs.fa
[16:15:57] Wrote 167 contigs totalling 4975456 bp.
[16:15:57] Predicting tRNAs and tmRNAs
[16:15:57] Running: aragorn -l -gc11 -w Sinf_annotate\/Sinf_590.fna
[16:16:00] 1 tRNA-Ser [41192,41279] 35 (tga)
[16:16:00] 2 tRNA-Ser c[90235,90322] 35 (gga)
[16:16:00] 3 tRNA-Arg [197374,197450] 35 (cct)
[16:16:00] 4 tRNA-SeC [197453,197521] 32 (tca)
[16:16:00] 5 tRNA-Arg [209814,209890] 35 (tct)
[16:16:00] 6 tRNA-Gln c[219310,219403] 32 (ttg)
[16:16:00] 7 tRNA-Val c[363259,363335] 35 (gac)
[16:16:00] 8 tRNA-Val c[363346,363422] 35 (gac)
[16:16:00] 9 tRNA-Tyr [719440,719524] 35 (gta)
[16:16:00] 10 tRNA-Tyr [719729,719813] 35 (gta)
[16:16:00] 11 tRNA-Leu c[903090,903176] 35 (taa)
[16:16:00] 12 tRNA-Cys c[903188,903261] 33 (gca)
[16:16:00] 13 tRNA-Gly c[903314,903389] 34 (gcc)
[16:16:00] 14 tRNA-Ser c[949180,949269] 35 (cga)
[16:16:00] 15 tRNA-Asn [950256,950331] 34 (gtt)
[16:16:00] 16 tRNA-Asn [951225,951300] 34 (gtt)
[16:16:00] 17 tRNA-Asn c[960298,960373] 34 (gtt)
[16:16:00] 18 tRNA-Asn [962131,962206] 34 (gtt)
[16:16:00] 19 tRNA-Pro [1188012,1188088] 35 (ggg)
[16:16:00] 1 tRNA-Leu [328414,328500] 35 (cag)
[16:16:00] 2 tRNA-Leu [328528,328614] 35 (cag)
[16:16:00] 3 tRNA-Leu [328646,328732] 35 (cag)
[16:16:00] 4 tRNA-Leu c[376244,376328] 35 (caa)
[16:16:00] 1 tRNA-Arg c[18,94] 35 (acg)
[16:16:00] 2 tRNA-Ser c[98,190] 35 (gct)
[16:16:00] 3 tRNA-Met [170682,170758] 35 (cat)
[16:16:00] 4 tRNA-Met [170788,170864] 35 (cat)
[16:16:00] 5 tRNA-Gly c[235503,235576] 33 (ccc)
[16:16:00] 6 tRNA-Phe [313934,314009] 34 (gaa)
[16:16:00] 1 tRNA-Lys c[218005,218080] 34 (ttt)
[16:16:00] 2 tRNA-Val c[218085,218160] 34 (tac)
[16:16:00] 3 tRNA-Val c[218204,218279] 34 (tac)
[16:16:00] 4 tRNA-Val c[218325,218400] 34 (tac)
[16:16:00] 5 tRNA-Ala [221106,221181] 34 (ggc)
[16:16:00] 6 tRNA-Ala [221224,221299] 34 (ggc)
[16:16:00] 1 tRNA-Gln c[113035,113109] 33 (ctg)
[16:16:00] 2 tRNA-Gln c[113153,113227] 33 (ctg)
[16:16:00] 3 tRNA-Met c[113274,113350] 35 (cat)
[16:16:00] 4 tRNA-Gln c[113366,113440] 33 (ttg)
[16:16:00] 5 tRNA-Gln c[113476,113550] 33 (ttg)
[16:16:00] 6 tRNA-Leu c[113574,113658] 35 (tag)
[16:16:00] 7 tRNA-Met c[113667,113743] 35 (cat)
[16:16:00] 8 tRNA-Lys [181283,181358] 34 (ttt)
[16:16:00] 9 tRNA-Val [181495,181570] 34 (tac)
[16:16:00] 10 tRNA-Lys [181574,181649] 34 (ttt)
[16:16:00] 11 tRNA-Lys [181792,181867] 34 (ttt)
[16:16:00] 1 tRNA-Gly c[66565,66640] 34 (gcc)
[16:16:00] 2 tRNA-Gly c[66677,66752] 34 (gcc)
[16:16:00] 3 tRNA-Gly c[66790,66865] 34 (gcc)
[16:16:00] 4 tRNA-Phe [96623,96698] 34 (gaa)
[16:16:00] 1 tRNA-SeC c[59892,59986] 35 (tca)
[16:16:00] 2 tRNA-Pro [191161,191237] 35 (cgg)
[16:16:00] 1 tRNA-Asp [278,354] 35 (gtc)
[16:16:00] 2 tRNA-Asp [9718,9794] 35 (gtc)
[16:16:00] 3 tRNA-Thr [85024,85099] 34 (cgt)
[16:16:00] 1 tRNA-Arg [153313,153387] 34 (cct)
[16:16:00] 1 tRNA-Arg c[1728,1804] 35 (tct)
[16:16:00] 1 tRNA-Ser [20574,20661] 35 (gga)
[16:16:00] 1 tRNA-Leu [8541,8627] 35 (gag)
[16:16:00] 2 tRNA-Met [11800,11876] 35 (cat)
[16:16:00] 3 tRNA-Met c[84465,84540] 34 (cat)
[16:16:00] 1 tRNA-Pro c[53084,53160] 35 (tgg)
[16:16:00] 2 tRNA-Leu c[53203,53289] 35 (cag)
[16:16:00] 3 tRNA-His c[53310,53385] 34 (gtg)
[16:16:00] 4 tRNA-Arg c[53440,53516] 35 (ccg)
[16:16:00] 5 tRNA-Trp c[89646,89721] 34 (cca)
[16:16:00] 6 tRNA-Asp c[89730,89806] 35 (gtc)
[16:16:01] 1 tmRNA [42418,42780] 91,126 ANDENYALAA
[16:16:01] 1 tRNA-Arg [54248,54324] 35 (tcg)
[16:16:01] 2 tRNA-Arg c[54264,54340] 35 (acg)
[16:16:01] tRNA c[4246,4322] is a pseudo/wacky gene - skipping.
[16:16:01] 1 tRNA-Thr c[370,445] 34 (ggt)
[16:16:01] 2 tRNA-Gly c[452,526] 34 (tcc)
[16:16:01] 3 tRNA-Tyr c[643,727] 35 (gta)
[16:16:01] 4 tRNA-Thr c[736,811] 34 (tgt)
[16:16:01] 1 tRNA-Ala c[238,313] 34 (tgc)
[16:16:01] 2 tRNA-Ile c[425,501] 35 (gat)
[16:16:01] 1 tRNA-Glu [132,207] 35 (ttc)
[16:16:01] 1 tRNA-Thr [102,177] 34 (ggt)
[16:16:01] 1 tRNA-Arg c[23,99] 35 (acg)
[16:16:01] 2 tRNA-Arg c[161,237] 35 (acg)
[16:16:01] 1 tRNA-Glu c[20,95] 35 (ttc)
[16:16:01] 1 tRNA-Arg c[2,78] 35 (acg)
[16:16:01] 1 tRNA-Arg c[99,175] 35 (acg)
[16:16:01] 1 tRNA-Arg [-3,74] 35 (tcg)
[16:16:01] 2 tRNA-Arg c[14,90] 35 (acg)
[16:16:01] 1 tRNA-Arg [-3,74] 35 (tcg)
[16:16:01] 2 tRNA-Arg c[14,90] 35 (acg)
[16:16:01] Found 87 tRNAs
[16:16:01] Predicting Ribosomal RNAs
[16:16:01] Running Barrnap with 8 threads
[16:16:02] 1 contig00004 401 5S ribosomal RNA
[16:16:02] 2 contig00008 2 5S ribosomal RNA (partial)
[16:16:02] 3 contig00014 91967 5S ribosomal RNA (partial)
[16:16:02] 4 contig00026 4 5S ribosomal RNA (partial)
[16:16:02] 5 contig00041 4370 5S ribosomal RNA (partial)
[16:16:02] 6 contig00049 5 16S ribosomal RNA
[16:16:02] 7 contig00073 2 5S ribosomal RNA (partial)
[16:16:02] 8 contig00073 219 5S ribosomal RNA (partial)
[16:16:02] 9 contig00114 4 5S ribosomal RNA
[16:16:02] 10 contig00130 6 5S ribosomal RNA (partial)
[16:16:02] 11 contig00131 89 5S ribosomal RNA (partial)
[16:16:02] 12 contig00132 88 5S ribosomal RNA (partial)
[16:16:02] 13 contig00152 50 5S ribosomal RNA (partial)
[16:16:02] 14 contig00155 11 5S ribosomal RNA
[16:16:02] 15 contig00164 6 5S ribosomal RNA (partial)
[16:16:02] 16 contig00165 12 5S ribosomal RNA (partial)
[16:16:02] Found 16 rRNAs
[16:16:02] Skipping ncRNA search, enable with --rfam if desired.
[16:16:02] Total of 102 tRNA + rRNA features
[16:16:02] Searching for CRISPR repeats
[16:16:03] CRISPR1 contig00003 106368 with 27 spacers
[16:16:03] CRISPR2 contig00003 124262 with 27 spacers
[16:16:03] Found 2 CRISPRs
[16:16:03] Predicting coding sequences
[16:16:03] Contigs total 4975456 bp, so using single mode
[16:16:03] Running: prodigal -i Sinf_annotate\/Sinf_590.fna -c -m -g 11 -p single -f sco -q
[16:16:09] Excluding CDS which overlaps existing RNA (tRNA) at contig00001:89855..90340 on - strand
[16:16:09] Excluding CDS which overlaps existing RNA (tRNA) at contig00001:219069..220253 on - strand
[16:16:09] Excluding CDS which overlaps existing RNA (repeat_region) at contig00003:107399..107863 on + strand
[16:16:10] Excluding CDS which overlaps existing RNA (tRNA) at contig00007:59517..60023 on + strand
[16:16:11] Excluding CDS which overlaps existing RNA (rRNA) at contig00049:104..358 on - strand
[16:16:11] Found 4633 CDS
[16:16:11] Connecting features back to sequences
[16:16:11] Not using genus-specific database. Try --usegenus to enable it.
[16:16:11] Annotating CDS, please be patient.
[16:16:11] Will use 8 CPUs for similarity searching.
[16:16:12] There are still 4633 unannotated CDS left (started with 4633)
[16:16:12] Will use blast to search against /home/linuxbrew/.linuxbrew/Cellar/prokka/1.14.6/db/kingdom/Bacteria/IS with 8 CPUs
[16:16:12] Running: cat Sinf_annotate\/Sinf_590.IS.tmp.2491.faa | parallel --gnu --plain -j 8 --block 91406 --recstart '>' --pipe blastp -query - -db /home/linuxbrew/.linuxbrew/Cellar/prokka/1.14.6/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Sinf_annotate\/Sinf_590.IS.tmp.2491.blast 2> /dev/null
[16:16:28] Could not run command: cat Sinf_annotate\/Sinf_590.IS.tmp.2491.faa | parallel --gnu --plain -j 8 --block 91406 --recstart '>' --pipe blastp -query - -db /home/linuxbrew/.linuxbrew/Cellar/prokka/1.14.6/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Sinf_annotate\/Sinf_590.IS.tmp.2491.blast 2> /dev/null
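
The failing command sends stderr to /dev/null, so the log above does not show why the blastp/parallel step failed. The following is a minimal sketch, not part of prokka, of one way to recover the command for a manual re-run with errors visible. The helper name `failed_command` is hypothetical, and the sketch assumes the log keeps one entry per line in the `[HH:MM:SS] Could not run command: ...` form shown above.

```python
import re
from typing import Optional

# Matches a log entry of the form "[HH:MM:SS] Could not run command: <cmd>".
# Assumption: each log entry sits on its own line, as in the prokka .log file.
FAILED = re.compile(r"\[\d{2}:\d{2}:\d{2}\] Could not run command: (.+)")

# Matches the trailing redirection that hides error messages.
STDERR_TO_NULL = re.compile(r"\s*2>\s*/dev/null\s*$")


def failed_command(log_text: str) -> Optional[str]:
    """Return the last failed command in the log without '2> /dev/null'.

    Returns None when the log contains no 'Could not run command' entry.
    """
    matches = FAILED.findall(log_text)
    if not matches:
        return None
    cmd = matches[-1].strip()
    # Remove the stderr redirection so a manual re-run prints the real error.
    return STDERR_TO_NULL.sub("", cmd)


# Example usage (the path is the log file named in the output above):
#   with open("Sinf_annotate/Sinf_590.log") as handle:
#       print(failed_command(handle.read()))
```

Pasting the printed command into the same shell should then show the underlying error message from blastp or parallel, assuming the temporary .faa file is still present in the output directory.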