I followed the steps in the documentation to initialize the database, create a config file, and start running atlas. Most jobs passed, but the rule classify failed. I tried multiple times, with the same result.
The Slurm resources I requested: srun --job-name "smk-atlas" --cpus-per-task 16 --gres=tmpspace:500G --mem-per-cpu 6000 --time 24:00:00 --pty bash
Can you help me figure out what went wrong?
snakemake main command:
atlas run all --cores 8
[2021-06-25 00:22 INFO] Executing: snakemake --snakefile /hpc/compgen/users/lchen/mambaforge/envs/atlas_env/lib/python3.7/site-packages/atlas/Snakefile --directory /hpc/compgen/projects/proj_pathogen_cfDNA/atlas --rerun-incomplete --configfile '/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/config.yaml' --nolock --use-conda --conda-prefix /hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs --scheduler greedy all --cores 8
Building DAG of jobs...
Updating job build_db_genomes.
Updating job combine_bined_coverages_MAGs.
Updating job combine_coverages_MAGs.
Updating job run_all_checkm_lineage_wf.
Updating job identify.
Updating job classify.
Updating job all_prodigal.
Updating job genomes.
Updating job gene2genome.
Updating job all_gtdb_trees.
Updating job classify.
Updating job combine_egg_nogg_annotations.
Using shell: /usr/bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Singularity containers: ignored
Job stats:
job         count    min threads    max threads
--------  -------  -------------  -------------
all             1              1              1
classify        1              8              8
genomes         1              1              1
total           3              1              8
[Fri Jun 25 00:22:50 2021]
rule classify:
input: genomes/taxonomy/gtdb/align, genomes/genomes
output: genomes/taxonomy/gtdb/classify
log: logs/taxonomy/gtdbtk/classify.txt, genomes/taxonomy/gtdb/gtdbtk.log
jobid: 130
threads: 8
resources: tmpdir=/scratch/7943009, mem=100, time=24
Activating conda environment: /hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566
[Fri Jun 25 00:43:04 2021]
Error in rule classify:
jobid: 130
output: genomes/taxonomy/gtdb/classify
log: logs/taxonomy/gtdbtk/classify.txt, genomes/taxonomy/gtdb/gtdbtk.log (check log file(s) for error message)
conda-env: /hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566
shell:
GTDBTK_DATA_PATH=/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/GTDB_V06 ; gtdbtk classify --genome_dir genomes/genomes --align_dir genomes/taxonomy/gtdb --out_dir genomes/taxonomy/gtdb --extension fasta --cpus 8 &> logs/taxonomy/gtdbtk/classify.txt
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job classify since they might be corrupted:
genomes/taxonomy/gtdb/classify
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: /hpc/compgen/projects/proj_pathogen_cfDNA/atlas/.snakemake/log/2021-06-25T002240.140877.snakemake.log
[2021-06-25 00:43 CRITICAL] Command 'snakemake --snakefile /hpc/compgen/users/lchen/mambaforge/envs/atlas_env/lib/python3.7/site-packages/atlas/Snakefile --directory /hpc/compgen/projects/proj_pathogen_cfDNA/atlas --rerun-incomplete --configfile '/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/config.yaml' --nolock --use-conda --conda-prefix /hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs --scheduler greedy all --cores 8 ' returned non-zero exit status 1.
logs/taxonomy/gtdbtk/classify.txt
Traceback (most recent call last):
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/external/pplacer.py", line 124, in _worker
raise PplacerException('An error was encountered while '
gtdbtk.exceptions.PplacerException: An error was encountered while running pplacer, check the log file: genomes/taxonomy/gtdb/classify/intermediate_results/pplacer/pplacer.bac120.out
[2021-06-25 00:43:02] ERROR: Controlled exit resulting from an unrecoverable error or warning.
================================================================================
EXCEPTION: PplacerException
MESSAGE: An error was encountered while running pplacer.
Traceback (most recent call last):
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/__main__.py", line 95, in main
gt_parser.parse_options(args)
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/main.py", line 735, in parse_options
self.classify(options)
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/main.py", line 440, in classify
classify.run(genomes,
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/classify.py", line 444, in run
classify_tree = self.place_genomes(user_msa_file,
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/classify.py", line 240, in place_genomes
pplacer.run(self.pplacer_cpus, 'wag', pplacer_ref_pkg, pplacer_json_out,
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/external/pplacer.py", line 92, in run
raise PplacerException(
gtdbtk.exceptions.PplacerException: An error was encountered while running pplacer.
================================================================================
genomes/taxonomy/gtdb/gtdbtk.log
[2021-06-25 00:23:01] INFO: GTDB-Tk v1.5.0
[2021-06-25 00:23:01] INFO: gtdbtk classify --genome_dir genomes/genomes --align_dir genomes/taxonomy/gtdb --out_dir genomes/taxonomy/gtdb --extension fasta --cpus 8
[2021-06-25 00:23:01] INFO: Using GTDB-Tk reference data version r202: /hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/GTDB_V06
[2021-06-25 00:23:01] WARNING: pplacer requires ~204 GB of RAM to fully load the bacterial tree into memory. However, 131.81 GB was detected. This may affect pplacer performance, or fail if there is insufficient swap space.
[2021-06-25 00:23:01] TASK: Placing 3 bacterial genomes into reference tree with pplacer using 8 CPUs (be patient).
[2021-06-25 00:23:01] INFO: pplacer version: v1.1.alpha19-0-g807f6f3
[2021-06-25 00:43:02] ERROR: Controlled exit resulting from an unrecoverable error or warning.
================================================================================
EXCEPTION: PplacerException
MESSAGE: An error was encountered while running pplacer.
________________________________________________________________________________
Traceback (most recent call last):
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/__main__.py", line 95, in main
gt_parser.parse_options(args)
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/main.py", line 735, in parse_options
self.classify(options)
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/main.py", line 440, in classify
classify.run(genomes,
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/classify.py", line 444, in run
classify_tree = self.place_genomes(user_msa_file,
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/classify.py", line 240, in place_genomes
pplacer.run(self.pplacer_cpus, 'wag', pplacer_ref_pkg, pplacer_json_out,
File "/hpc/compgen/projects/proj_pathogen_cfDNA/atlas/databases/conda_envs/b63cf6a8393c12a56f10f74648452566/lib/python3.8/site-packages/gtdbtk/external/pplacer.py", line 92, in run
raise PplacerException(
gtdbtk.exceptions.PplacerException: An error was encountered while running pplacer.
================================================================================
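One thing that stands out in the logs above is memory. The GTDB-Tk warning says pplacer needs about 204 GB of RAM and that only 131.81 GB was detected, while the srun call asks for 16 CPUs at 6000 per CPU. The sketch below only redoes that arithmetic. It assumes the bare `--mem-per-cpu` value is in megabytes (the Slurm default unit) and treats 1 GB as 1000 MB for a rough comparison. The function name is made up for illustration, and this is not a confirmed diagnosis: the actual pplacer error should be in genomes/taxonomy/gtdb/classify/intermediate_results/pplacer/pplacer.bac120.out.

```python
# Rough memory check using only numbers quoted in the post and logs above.
# Assumption: a bare --mem-per-cpu value is in megabytes (Slurm's default unit),
# and 1 GB is taken as 1000 MB, which is good enough for a rough comparison.


def slurm_total_mem_gb(cpus_per_task: int, mem_per_cpu_mb: int) -> float:
    """Total memory requested by the srun call, in GB (hypothetical helper)."""
    return cpus_per_task * mem_per_cpu_mb / 1000


# Values from: srun --cpus-per-task 16 --mem-per-cpu 6000
requested_gb = slurm_total_mem_gb(cpus_per_task=16, mem_per_cpu_mb=6000)

# Values from the GTDB-Tk WARNING line in genomes/taxonomy/gtdb/gtdbtk.log
pplacer_needed_gb = 204
node_detected_gb = 131.81

print(f"requested via srun:        {requested_gb:.0f} GB")
print(f"shortfall vs pplacer need: {pplacer_needed_gb - requested_gb:.0f} GB")
print(f"detected RAM vs need:      {pplacer_needed_gb - node_detected_gb:.2f} GB short")
```

If memory does turn out to be the cause, both the requested allocation and the RAM detected on the node are well below what the warning says pplacer needs to load the bacterial tree.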