Ecogenomics / GTDBTk

GTDB-Tk: a toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
https://ecogenomics.github.io/GTDBTk/
GNU General Public License v3.0

Fail at gene calling #32

Closed pyilmaz closed 6 years ago

pyilmaz commented 6 years ago

Hello Pierre, I most likely have an issue with my installation, but just wanted to check if there is something else happening. I'd appreciate any ideas you might have, many thanks! My run fails at prodigal, with the following log:

[2018-08-30 22:54:59] INFO: Running Prodigal to identify genes.
Process Process-3:1:1:
Traceback (most recent call last):
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/biolib/parallel.py", line 107, in producer
    rtn = producer_callback(dataItem)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/biolib/external/prodigal.py", line 122, in _producer
    prodigalParser = ProdigalGeneFeatureParser(gff_file_tmp)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/biolib/external/prodigal.py", line 285, in __init__
    self.__parseGFF(filename)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/biolib/external/prodigal.py", line 302, in __parseGFF
    self.translationTable = line.split(';')[4]
IndexError: list index out of range
Process Process-3:
Traceback (most recent call last):
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/gtdbtk/external/prodigal.py", line 100, in _worker
    rtn_files = self._run_prodigal(genome_id, file_path)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/gtdbtk/external/prodigal.py", line 64, in _run_prodigal
    summary_stats = summary_stats[summary_stats.keys()[0]]
AttributeError: 'NoneType' object has no attribute 'keys'
Process Process-2:1:1:
Traceback (most recent call last):
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/biolib/parallel.py", line 107, in producer
    rtn = producer_callback(dataItem)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/biolib/external/prodigal.py", line 122, in _producer
    prodigalParser = ProdigalGeneFeatureParser(gff_file_tmp)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/biolib/external/prodigal.py", line 285, in __init__
    self.__parseGFF(filename)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/biolib/external/prodigal.py", line 302, in __parseGFF
    self.translationTable = line.split(';')[4]
IndexError: list index out of range
Process Process-2:
Traceback (most recent call last):
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/gtdbtk/external/prodigal.py", line 100, in _worker
    rtn_files = self._run_prodigal(genome_id, file_path)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/gtdbtk/external/prodigal.py", line 64, in _run_prodigal
    summary_stats = summary_stats[summary_stats.keys()[0]]
AttributeError: 'NoneType' object has no attribute 'keys'

[2018-08-30 22:55:13] INFO: Identifying TIGRFAM protein families.
[2018-08-30 22:55:13] ERROR: integer division or modulo by zero
('Unexpected error:', <type 'exceptions.ZeroDivisionError'>)
Traceback (most recent call last):
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/bin/gtdbtk", line 362, in <module>
    gt_parser.parse_options(args)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/gtdbtk/main.py", line 319, in parse_options
    self.identify(options)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/gtdbtk/main.py", line 161, in identify
    options.prefix)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/gtdbtk/markers.py", line 232, in identify
    tigr_search.run(gene_files)
  File "/bioinf/home/pyilmaz/miniconda3/envs/gtdb_toolkit/lib/python2.7/site-packages/gtdbtk/external/tigrfam_search.py", line 152, in run
    self.cpus_per_genome = max(1, self.threads / len(gene_files))
ZeroDivisionError: integer division or modulo by zero
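
Both tracebacks in the log point at unguarded assumptions: the GFF parser reads the translation table from a fixed position (`line.split(';')[4]`), and the TIGRFAM step divides by `len(gene_files)`, which was evidently zero in this run. A minimal sketch of more defensive versions might look like the following. The function names `parse_translation_table` and `cpus_per_genome` are hypothetical and are not part of biolib or GTDB-Tk, and the presence of a `transl_table=` field on a `#` header line is an assumption about what Prodigal writes to its GFF output.

```python
def parse_translation_table(gff_lines):
    """Return the translation table found in GFF header lines, or None.

    Assumption: the header carries a 'transl_table=<int>' field among
    ';'-separated key=value pairs on a line starting with '#'.
    """
    for line in gff_lines:
        if not line.startswith("#"):
            continue
        # Look the field up by name instead of by position, so a short or
        # reordered header yields None rather than an IndexError.
        for field in line.lstrip("# ").split(";"):
            key, sep, value = field.strip().partition("=")
            if sep and key.endswith("transl_table"):
                return int(value)
    return None


def cpus_per_genome(threads, gene_files):
    """Split threads across genomes, failing clearly when there are none."""
    if not gene_files:
        # An empty list most likely means gene calling failed upstream.
        raise ValueError("no gene files to process; check the gene calling step")
    return max(1, threads // len(gene_files))
```

With checks like these, an empty or truncated GFF file would surface as a `None` translation table or a descriptive `ValueError` close to the real cause, rather than as a `ZeroDivisionError` one step later.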

pchaumeil commented 6 years ago

Hello Pelin,

Could you please send me the input data you are using and the command line you are trying to run?

Thanks, Pierre

pyilmaz commented 6 years ago

Hello Pierre, I seem to have solved the issue; I think there was some problem with perl (or with which perl was being used). Another thing showed up at a later stage, though:

FastANI has stopped: ERROR, skch::validateInputFiles, Could not open /bioinf/home/pyilmaz/projects/dbs/gtdb-tk/release86/fastani/database/GCF_000733935.1_genomic.fna

This was easy to fix by adding "genomic.fna.gz" to config.py.
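
A quick way to confirm that a suffix setting matches what is actually on disk is to list the genomes whose files cannot be found. The sketch below is only illustrative: `missing_genome_files` is a hypothetical helper, not a GTDB-Tk function, and it assumes file names are formed as the accession followed by the suffix, as in the path from the FastANI error.

```python
import os


def missing_genome_files(database_dir, accessions, suffix="_genomic.fna.gz"):
    """Return the accessions with no '<accession><suffix>' file in database_dir."""
    return [
        acc for acc in accessions
        if not os.path.isfile(os.path.join(database_dir, acc + suffix))
    ]
```

Calling it once with `suffix="_genomic.fna"` and once with `suffix="_genomic.fna.gz"` shows which of the two naming schemes the reference database uses.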

pchaumeil commented 6 years ago

Hi Pelin,

Glad to see you made it work! The latest config_template should already be set to use genomic.fna.gz instead of genomic.fna, so I suspect you still have the previous version of the config file.

Could you please check that the RED dictionaries in your config file are set to:

RED_DIST_BAC_DICT = {"d": 0.00, "p": 0.318988232537, "c": 0.473592766536, "o": 0.631939494307, "f": 0.776495919093, "g": 0.940062093841}
RED_DIST_ARC_DICT = {"d": 0.00, "p": 0.219745045527, "c": 0.348416500728, "o": 0.51768679785, "f": 0.714706411191, "g": 0.909385795176}

Those values should reflect the rank RED values generated for release86.
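
Comparing twelve long decimals by eye is error-prone, so a small check can help. The following is a rough sketch, not part of GTDB-Tk: `red_mismatches` is a hypothetical helper, the expected values are the release86 numbers quoted in this comment, and you would pass in the dictionaries from your own config file.

```python
# Expected release86 values, as quoted in this thread.
EXPECTED_BAC = {"d": 0.00, "p": 0.318988232537, "c": 0.473592766536,
                "o": 0.631939494307, "f": 0.776495919093, "g": 0.940062093841}
EXPECTED_ARC = {"d": 0.00, "p": 0.219745045527, "c": 0.348416500728,
                "o": 0.51768679785, "f": 0.714706411191, "g": 0.909385795176}


def red_mismatches(configured, expected, tol=1e-9):
    """Return the sorted ranks whose configured RED value is missing or differs."""
    return sorted(
        rank for rank in expected
        if rank not in configured or abs(configured[rank] - expected[rank]) > tol
    )
```

An empty list from `red_mismatches(RED_DIST_BAC_DICT, EXPECTED_BAC)` and from the corresponding archaeal call would indicate that the config matches the values above.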

Cheers, Pierre

pyilmaz commented 6 years ago

Thanks again Pierre, I've updated the config now also, and all is well. And thanks for this super useful tool!