Ecogenomics / GTDBTk

GTDB-Tk: a toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
https://ecogenomics.github.io/GTDBTk/
GNU General Public License v3.0

GTDB-Tk 2.4.0: Prodigal returned a non-zero exit code #600

Closed by Nura7177 2 months ago

Nura7177 commented 3 months ago

Hi, @pchaumeil, Pierre,

While running --identify on 1800 complete bacterial genomes, I got the error below (a test run on a single genome worked well and did not produce it). The markers_summary.tsv files were not generated, so we cannot proceed to --align:

Traceback (most recent call last):
  File "/trinity/home/anaconda3/envs/gtdbtk-2.4.0/lib/python3.8/site-packages/gtdbtk/external/prodigal.py", line 221, in run
    raise ProdigalException('Prodigal returned a non-zero exit code.')
gtdbtk.exceptions.ProdigalException: Prodigal returned a non-zero exit code.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/trinity/home/anaconda3/envs/gtdbtk-2.4.0/lib/python3.8/site-packages/gtdbtk/main.py", line 102, in main
    gt_parser.parse_options(args)
  File "/trinity/home/anaconda3/envs/gtdbtk-2.4.0/lib/python3.8/site-packages/gtdbtk/main.py", line 1188, in parse_options
    self.identify(options)
  File "/trinity/home/anaconda3/envs/gtdbtk-2.4.0/lib/python3.8/site-packages/gtdbtk/main.py", line 319, in identify
    reports = markers.identify(genomes,
  File "/trinity/home/anaconda3/envs/gtdbtk-2.4.0/lib/python3.8/site-packages/gtdbtk/markers.py", line 205, in identify
    genome_dictionary = prodigal.run(genomes, tln_tables)
  File "/trinity/home/anaconda3/envs/gtdbtk-2.4.0/lib/python3.8/site-packages/gtdbtk/external/prodigal.py", line 230, in run
    raise ProdigalException(f'An exception was caught while running Prodigal: {e}')
gtdbtk.exceptions.ProdigalException: An exception was caught while running Prodigal: Prodigal returned a non-zero exit code.

The following files were not generated:

gtdbtk.ar53.markers_summary.tsv
gtdbtk.bac120.markers_summary.tsv
gtdbtk.failed_genomes.tsv
gtdbtk.translation_table_summary.tsv

Environment

Debugging information

GTDBtk_2.4.0_log.txt

Additional comments

All the genomes were processed and their directories were created in genomes_gtdbtk/ident_out_2/identify/intermediate_results/marker_genes. For example:

genomes_gtdbtk/ident_out_2/identify/intermediate_results/marker_genes/GCA_900637975.1_53550_F01_genomic]$ ls
GCA_900637975.1_53550_F01_genomic_protein.faa
GCA_900637975.1_53550_F01_genomic_protein.fna
GCA_900637975.1_53550_F01_genomic_protein.gff
prodigal_translation_table.tsv
GCA_900637975.1_53550_F01_genomic_protein.faa.sha256
GCA_900637975.1_53550_F01_genomic_protein.fna.sha256
GCA_900637975.1_53550_F01_genomic_protein.gff.sha256
prodigal_translation_table.tsv.sha256
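
With this many genomes, one way to narrow down which genome made Prodigal fail is to scan every per-genome directory for the files shown in the listing above. The following is only a minimal sketch: it assumes that a successfully processed genome always yields the three `_protein.*` files plus `prodigal_translation_table.tsv` (as seen for this one genome), and that a missing or empty file points to a failed run. The function name `find_incomplete_genomes` is hypothetical and is not part of GTDB-Tk.

```python
from pathlib import Path

# File names taken from the directory listing above. It is an assumption
# that every successfully processed genome produces this same set.
EXPECTED_SUFFIXES = ("_protein.faa", "_protein.fna", "_protein.gff")
EXPECTED_FIXED = ("prodigal_translation_table.tsv",)


def find_incomplete_genomes(marker_genes_dir):
    """Return {genome_id: [missing or empty file names]} for incomplete genome directories."""
    incomplete = {}
    for genome_dir in sorted(Path(marker_genes_dir).iterdir()):
        if not genome_dir.is_dir():
            continue
        genome_id = genome_dir.name
        expected = [genome_id + suffix for suffix in EXPECTED_SUFFIXES]
        expected += list(EXPECTED_FIXED)
        missing = []
        for name in expected:
            path = genome_dir / name
            # Treat an empty file the same as a missing one.
            if not path.is_file() or path.stat().st_size == 0:
                missing.append(name)
        if missing:
            incomplete[genome_id] = missing
    return incomplete
```

Calling `find_incomplete_genomes("genomes_gtdbtk/ident_out_2/identify/intermediate_results/marker_genes")` would return an empty dictionary if every directory looks like the one above, in which case the failure would have to be sought elsewhere (for example in the attached log).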

Nura7177 commented 3 months ago

I split the 1900 genomes into two subsets and ran --identify on 900+ genomes. It worked well and did not generate this error. However, I now have two output directories, each with its own set of marker genes and its own gtdbtk.bac120.markers_summary.tsv. How, and at which stage, can they be combined so that all the genomes are placed in one tree?
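
For anyone wanting to reproduce the split described above, here is a rough sketch of dividing a list of genome FASTA files into near-equal subsets and writing one two-column, tab-separated file per subset (FASTA path, genome ID). Everything here is an assumption rather than GTDB-Tk code: the helper names are hypothetical, the genome ID is simply the file name without its last extension, and the two-column layout is assumed to match what the `--batchfile` option of `gtdbtk identify` expects, so please check the GTDB-Tk documentation before relying on it. It does not address how to merge the results afterwards.

```python
from pathlib import Path


def split_into_batches(fasta_paths, n_batches):
    """Split paths into n_batches contiguous chunks whose sizes differ by at most one."""
    paths = sorted(str(p) for p in fasta_paths)
    size, extra = divmod(len(paths), n_batches)
    batches = []
    start = 0
    for i in range(n_batches):
        # The first `extra` batches each take one additional genome.
        end = start + size + (1 if i < extra else 0)
        batches.append(paths[start:end])
        start = end
    return batches


def write_batchfile(paths, out_path):
    """Write one 'fasta_path<TAB>genome_id' line per genome."""
    with open(out_path, "w") as handle:
        for p in paths:
            name = Path(p).name
            # Hypothetical convention: genome ID = file name minus its last extension.
            genome_id = name.rsplit(".", 1)[0] if "." in name else name
            handle.write(f"{p}\t{genome_id}\n")
```

For example, `split_into_batches(Path("genomes").glob("*.fna"), 2)` followed by one `write_batchfile` call per subset would give two files, where `genomes` is a placeholder directory name.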