I installed prokka using conda and am running it with a custom database (PHROGs: https://phrogs.lmge.uca.fr/) to annotate prokaryotic viruses. I am doing this on my university's HPC using a for loop. It works fine for most (3930) of my files, but seems to malfunction for a handful (5). Specifically, for these files prokka runs to completion with no error but creates empty outputs (.faa, etc.). It also never deletes several temporary files.
I've double-checked the input .fasta files that lead to these empty outputs and they all look fine. Each is a standard .fasta (header starting with ">") that is readable, non-empty, and most definitely a regular file. They all contain one sequence per file and are fairly short (~5-10kb).
One of the problem files: busby_phage_123.fa
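For reference, a minimal sketch of how the sequence length in each input could be checked independently of prokka, which is useful for comparing against the "Wrote 1 contigs totalling 1857 bp" line in the log below. The function name `fasta_len` is hypothetical (it is not part of prokka), and the sketch assumes plain, uncompressed FASTA files.

```shell
# Hypothetical helper, not part of prokka.
# Prints the total number of sequence characters in a FASTA file,
# skipping header lines and ignoring spaces, tabs and carriage returns.
fasta_len() {
    awk '!/^>/ { gsub(/[ \t\r]/, ""); n += length($0) } END { print n + 0 }' "$1"
}
```

Something like `for f in *.fa; do printf '%s\t%s\n' "$f" "$(fasta_len "$f")"; done` would then list every input next to its length, so the files that produce empty outputs can be compared against the rest.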
The loop I run:
for f in *.fa
do
    sbatch ../scripts/annot.sh "$f"
done
Output dir:
-rw-rw-r-- 1 dsbard dsbard 0 Jun 28 14:52 annotate_busby_phage_123.AMR.tmp.588077.faa
-rw-rw-r-- 1 dsbard dsbard 0 Jun 28 14:52 annotate_busby_phage_123.HAMAP.hmm.tmp.588077.faa
-rw-rw-r-- 1 dsbard dsbard 0 Jun 28 14:52 annotate_busby_phage_123.IS.tmp.588077.faa
-rw-rw-r-- 1 dsbard dsbard 0 Jun 28 14:52 annotate_busby_phage_123.all_phrogs.hmm.tmp.588077.faa
-rw-rw-r-- 1 dsbard dsbard 3.1K Jun 28 14:52 annotate_busby_phage_123.err
-rw-rw-r-- 1 dsbard dsbard 0 Jun 28 14:52 annotate_busby_phage_123.faa
-rw-rw-r-- 1 dsbard dsbard 0 Jun 28 14:52 annotate_busby_phage_123.ffn
-rw-rw-r-- 1 dsbard dsbard 1.9K Jun 28 14:52 annotate_busby_phage_123.fna
-rw-rw-r-- 1 dsbard dsbard 2.0K Jun 28 14:52 annotate_busby_phage_123.fsa
-rw-rw-r-- 1 dsbard dsbard 2.9K Jun 28 14:52 annotate_busby_phage_123.gbk
-rw-rw-r-- 1 dsbard dsbard 2.0K Jun 28 14:52 annotate_busby_phage_123.gff
-rw-rw-r-- 1 dsbard dsbard 8.6K Jun 28 14:52 annotate_busby_phage_123.log
-rw-rw-r-- 1 dsbard dsbard 0 Jun 28 14:52 annotate_busby_phage_123.sprot.tmp.588077.faa
-rw-rw-r-- 1 dsbard dsbard 3.0K Jun 28 14:52 annotate_busby_phage_123.sqn
-rw-rw-r-- 1 dsbard dsbard 25 Jun 28 14:52 annotate_busby_phage_123.tbl
-rw-rw-r-- 1 dsbard dsbard 53 Jun 28 14:52 annotate_busby_phage_123.tsv
-rw-rw-r-- 1 dsbard dsbard 55 Jun 28 14:52 annotate_busby_phage_123.txt
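A sketch of how runs with an empty protein output, like the one listed above, could be located across all output folders. It assumes every run writes into its own subdirectory of a common parent (as the `--outdir` path in the log suggests); the function name `empty_faa_runs` is hypothetical.

```shell
# Hypothetical helper, not part of prokka.
# Lists zero-byte final .faa files below a directory, skipping the
# leftover *.tmp.*.faa files so each affected run is reported once.
empty_faa_runs() {
    find "$1" -type f -name '*.faa' ! -name '*.tmp.*' -size 0
}
```

Called as `empty_faa_runs /path/to/annot`, it prints one path per affected run.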
Log file contents:
[14:52:04] This is prokka 1.14.6
[14:52:04] Written by Torsten Seemann <torsten.seemann@gmail.com>
[14:52:04] Homepage is https://github.com/tseemann/prokka
[14:52:04] Local time is Wed Jun 28 14:52:04 2023
[14:52:04] You are dsbard
[14:52:04] Operating system is linux
[14:52:04] You have BioPerl 1.7.8
[14:52:04] System has 24 cores.
[14:52:04] Will use maximum of 8 cores.
[14:52:04] Annotating as >>> Bacteria <<<
[14:52:04] Generating locus_tag from 'busby_phage_123.fa' contents.
[14:52:04] Setting --locustag ELNPBALK from MD5 e579ba5461c138c2561e1d09ab83f040
[14:52:04] Creating new output folder: /home/dsbard/bee_phage/dls/network/annot/busby_phage_123
[14:52:04] Running: mkdir -p \/home\/dsbard\/bee_phage\/dls\/network\/annot\/busby_phage_123
[14:52:04] Using filename prefix: annotate_busby_phage_123.XXX
[14:52:04] Setting HMMER_NCPU=1
[14:52:04] Writing log to: /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.log
[14:52:04] Command: /home/dsbard/yes/envs/prokka/bin/prokka --kingdom Bacteria busby_phage_123.fa --outdir /home/dsbard/bee_phage/dls/network/annot/busby_phage_123 --hmms /home/dsbard/phrogs/all_phrogs.hmm --prefix annotate_busby_phage_123
[14:52:04] Appending to PATH: /home/dsbard/yes/envs/prokka/bin
[14:52:04] Looking for 'aragorn' - found /home/dsbard/yes/envs/prokka/bin/aragorn
[14:52:04] Determined aragorn version is 001002 from 'ARAGORN v1.2.41 Dean Laslett'
[14:52:04] Looking for 'barrnap' - found /home/dsbard/yes/envs/prokka/bin/barrnap
[14:52:04] Determined barrnap version is 000009 from 'barrnap 0.9'
[14:52:04] Looking for 'blastp' - found /home/dsbard/yes/envs/prokka/bin/blastp
[14:52:04] Determined blastp version is 002014 from 'blastp: 2.14.0+'
[14:52:04] Looking for 'cmpress' - found /home/dsbard/yes/envs/prokka/bin/cmpress
[14:52:04] Determined cmpress version is 001001 from '# INFERNAL 1.1.4 (Dec 2020)'
[14:52:04] Looking for 'cmscan' - found /home/dsbard/yes/envs/prokka/bin/cmscan
[14:52:05] Determined cmscan version is 001001 from '# INFERNAL 1.1.4 (Dec 2020)'
[14:52:05] Looking for 'egrep' - found /usr/bin/egrep
[14:52:05] Looking for 'find' - found /usr/bin/find
[14:52:05] Looking for 'grep' - found /usr/bin/grep
[14:52:05] Looking for 'hmmpress' - found /home/dsbard/yes/envs/prokka/bin/hmmpress
[14:52:05] Determined hmmpress version is 003003 from '# HMMER 3.3.2 (Nov 2020); http://hmmer.org/'
[14:52:05] Looking for 'hmmscan' - found /home/dsbard/yes/envs/prokka/bin/hmmscan
[14:52:05] Determined hmmscan version is 003003 from '# HMMER 3.3.2 (Nov 2020); http://hmmer.org/'
[14:52:05] Looking for 'java' - found /home/dsbard/yes/envs/prokka/bin/java
[14:52:05] Looking for 'makeblastdb' - found /home/dsbard/yes/envs/prokka/bin/makeblastdb
[14:52:05] Determined makeblastdb version is 002014 from 'makeblastdb: 2.14.0+'
[14:52:05] Looking for 'minced' - found /home/dsbard/yes/envs/prokka/bin/minced
[14:52:05] Determined minced version is 004002 from 'minced 0.4.2'
[14:52:05] Looking for 'parallel' - found /home/dsbard/yes/envs/prokka/bin/parallel
[14:52:05] Determined parallel version is 20230522 from 'GNU parallel 20230522'
[14:52:05] Looking for 'prodigal' - found /home/dsbard/yes/envs/prokka/bin/prodigal
[14:52:05] Determined prodigal version is 002006 from 'Prodigal V2.6.3: February, 2016'
[14:52:05] Looking for 'prokka-genbank_to_fasta_db' - found /home/dsbard/yes/envs/prokka/bin/prokka-genbank_to_fasta_db
[14:52:05] Looking for 'sed' - found /usr/bin/sed
[14:52:05] Looking for 'tbl2asn' - found /home/dsbard/yes/envs/prokka/bin/tbl2asn
[14:52:05] Determined tbl2asn version is 025007 from 'tbl2asn 25.7 arguments:'
[14:52:05] Using genetic code table 11.
[14:52:05] Loading and checking input file: busby_phage_123.fa
[14:52:05] Wrote 1 contigs totalling 1857 bp.
[14:52:05] Predicting tRNAs and tmRNAs
[14:52:05] Running: aragorn -l -gc11 -w \/home\/dsbard\/bee_phage\/dls\/network\/annot\/busby_phage_123\/annotate_busby_phage_123\.fna
[14:52:06] Found 0 tRNAs
[14:52:06] Predicting Ribosomal RNAs
[14:52:06] Running Barrnap with 8 threads
[14:52:06] Found 0 rRNAs
[14:52:06] Skipping ncRNA search, enable with --rfam if desired.
[14:52:06] Total of 0 tRNA + rRNA features
[14:52:06] Searching for CRISPR repeats
[14:52:06] Found 0 CRISPRs
[14:52:06] Predicting coding sequences
[14:52:06] Contigs total 1857 bp, so using meta mode
[14:52:06] Running: prodigal -i \/home\/dsbard\/bee_phage\/dls\/network\/annot\/busby_phage_123\/annotate_busby_phage_123\.fna -c -m -g 11 -p meta -f sco -q
[14:52:06] Found 0 CDS
[14:52:06] Connecting features back to sequences
[14:52:06] Not using genus-specific database. Try --usegenus to enable it.
[14:52:06] Preparing user-supplied primary HMMER annotation source: /home/dsbard/phrogs/all_phrogs.hmm
[14:52:06] Using /inference source as 'all_phrogs'
[14:52:06] Annotating CDS, please be patient.
[14:52:06] Will use 8 CPUs for similarity searching.
[14:52:06] Found 0 unique /gene codes.
[14:52:06] Fixed 0 colliding /gene names.
[14:52:06] Adding /locus_tag identifiers
[14:52:06] Assigned 0 locus_tags to CDS and RNA features.
[14:52:06] Writing outputs to /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/
[14:52:06] Generating annotation statistics file
[14:52:06] Generating Genbank and Sequin files
[14:52:06] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14.6 from https://github.com/tseemann/prokka' -Z \/home\/dsbard\/bee_phage\/dls\/network\/annot\/busby_phage_123\/annotate_busby_phage_123\.err -i \/home\/dsbard\/bee_phage\/dls\/network\/annot\/busby_phage_123\/annotate_busby_phage_123\.fsa 2> /dev/null
[14:52:07] Deleting unwanted file: /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/errorsummary.val
[14:52:07] Deleting unwanted file: /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.dr
[14:52:07] Deleting unwanted file: /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.fixedproducts
[14:52:07] Deleting unwanted file: /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.ecn
[14:52:07] Deleting unwanted file: /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.val
[14:52:07] Repairing broken .GBK output that tbl2asn produces...
[14:52:07] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < \/home\/dsbard\/bee_phage\/dls\/network\/annot\/busby_phage_123\/annotate_busby_phage_123\.gbf > \/home\/dsbard\/bee_phage\/dls\/network\/annot\/busby_phage_123\/annotate_busby_phage_123\.gbk
[14:52:07] Deleting unwanted file: /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.gbf
[14:52:07] Output files:
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.ffn
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.tbl
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.faa
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.sqn
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.all_phrogs.hmm.tmp.588077.faa
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.IS.tmp.588077.faa
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.gbk
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.tsv
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.sprot.tmp.588077.faa
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.txt
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.fsa
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.AMR.tmp.588077.faa
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.fna
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.gff
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.err
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.HAMAP.hmm.tmp.588077.faa
[14:52:07] /home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123.log
[14:52:07] Annotation finished successfully.
[14:52:07] Walltime used: 0.05 minutes
[14:52:07] If you use this result please cite the Prokka paper:
[14:52:07] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9.
[14:52:07] Type 'prokka --citation' for more details.
[14:52:07] Share and enjoy!
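Since the log above reports "Found 0 CDS" right after the prodigal step, the same line can be searched for in every run's log to see whether the other affected files share it. This is only a sketch: the function name `zero_cds_logs` is hypothetical, and it assumes the logs keep the `.log` extension and the message wording shown above.

```shell
# Hypothetical helper, not part of prokka.
# Prints the path of every .log file below a directory that contains
# the "Found 0 CDS" line written after gene prediction.
zero_cds_logs() {
    find "$1" -type f -name '*.log' -exec grep -l 'Found 0 CDS' {} +
}
```

Running `zero_cds_logs /path/to/annot` gives a list that can be compared with the set of runs whose .faa output is empty.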
Error file contents:
Discrepancy Report Results
Summary
DISC_SOURCE_QUALS_ASNDISC:strain (all present, all unique)
DISC_SOURCE_QUALS_ASNDISC:taxname (all present, all unique)
DISC_COUNT_NUCLEOTIDES:1 nucleotide Bioseqs are present
NO_ANNOTATION:1 bioseqs have no features
DISC_QUALITY_SCORES:Quality scores are missing on all sequences.
ONCALLER_COMMENT_PRESENT:1 comment descriptors were found (all same)
MISSING_GENOMEASSEMBLY_COMMENTS:1 bioseqs are missing GenomeAssembly structured comments
MOLTYPE_NOT_MRNA:1 molecule types are not set as mRNA.
TECHNIQUE_NOT_TSA:1 technique are not set as TSA
MISSING_STRUCTURED_COMMENT:1 sequences do not include structured comments.
MISSING_PROJECT:1 sequences do not include project.
DISC_INCONSISTENT_MOLINFO_TECH:Molinfo Technique Report (some missing, all same)
Detailed Report
DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::strain (all present, all unique)
DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::1 sources have unique values for strain
DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::taxname (all present, all unique)
DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::1 sources have unique values for taxname
DiscRep_ALL:DISC_COUNT_NUCLEOTIDES::1 nucleotide Bioseqs are present
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123 (length 1857)
DiscRep_ALL:NO_ANNOTATION::1 bioseqs have no features
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123 (length 1857)
DiscRep_ALL:DISC_QUALITY_SCORES::Quality scores are missing on all sequences.
DiscRep_ALL:ONCALLER_COMMENT_PRESENT::1 comment descriptors were found (all same)
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123:Annotated using prokka 1.14.6 from https://github.com/tseemann/prokka
DiscRep_ALL:MISSING_GENOMEASSEMBLY_COMMENTS::1 bioseqs are missing GenomeAssembly structured comments
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123 (length 1857)
DiscRep_ALL:MOLTYPE_NOT_MRNA::1 molecule types are not set as mRNA.
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123 (length 1857)
DiscRep_ALL:TECHNIQUE_NOT_TSA::1 technique are not set as TSA
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123 (length 1857)
DiscRep_ALL:MISSING_STRUCTURED_COMMENT::1 sequences do not include structured comments.
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123 (length 1857)
DiscRep_ALL:MISSING_PROJECT::1 sequences do not include project.
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123 (length 1857)
DiscRep_ALL:DISC_INCONSISTENT_MOLINFO_TECH::Molinfo Technique Report (some missing, all same)
DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::technique (all missing)
DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::1 Molinfos are missing field technique
/home/dsbard/bee_phage/dls/network/annot/busby_phage_123/annotate_busby_phage_123:busby_phage_123 (length 1857)
It's my first time opening an issue, so let me know if there is other info I should have included. Any help would be greatly appreciated!
Thank you.