metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

Error in pplacer #402

Closed animesh closed 3 years ago

animesh commented 3 years ago

The invocation looks like the following:

(atlas) animeshs@DMED7596:~/ayu$ atlas run all
[2021-06-19 12:08 INFO] Executing: snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   all
Building DAG of jobs...
Updating job 81 (combine_egg_nogg_annotations).
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       align
        19      align_reads_to_MAGs
        1       all
        1       all_gtdb_trees
        1       all_prodigal
        19      bam_2_sam_MAGs
        1       build_db_genomes
        1       classify
        1       combine_bined_coverages_MAGs
        1       combine_coverages_MAGs
        19      convert_sam_to_bam
        1       first_dereplication
        1       gene2genome
        1       genomes
        1       identify
        19      pileup_MAGs
        1       rename_genomes
        1       run_all_checkm_lineage_wf
        1       second_dereplication
        91

[Sat Jun 19 12:09:43 2021]
rule first_dereplication:
    input: genomes/all_bins, genomes/quality.csv
    output: genomes/pre_dereplication/dereplicated_genomes
    log: logs/genomes/pre_dereplication.log
    jobid: 463
    resources: mem=160, time=12

Job counts:
        count   jobs
        1       first_dereplication
        1
[Sat Jun 19 12:09:46 2021]
Error in rule first_dereplication:
    jobid: 0
    output: genomes/pre_dereplication/dereplicated_genomes
    log: logs/genomes/pre_dereplication.log (check log file(s) for error message)
    conda-env: /mnt/z/ayu/databases/conda_envs/f4069a5d
    shell:
         rm -rf genomes/pre_dereplication ; dRep dereplicate    --genomes genomes/all_bins/*.fasta  --genomeInfo genomes/quality.csv  --length 5000  --completeness 50  --contamination 10  --SkipSecondary  --P_ani 0.95  --completeness_weight 1  --contamination_weight 5  --strain_heterogeneity_weight 1  --N50_weight 0.5  --size_weight 0  --MASH_sketch 5000  --processors 1    genomes/pre_dereplication  &> logs/genomes/pre_dereplication.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Exiting because a job execution failed. Look above for error message
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: /mnt/z/ayu/.snakemake/log/2021-06-19T120839.643861.snakemake.log
[2021-06-19 12:09 CRITICAL] Command 'snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   all   ' returned non-zero exit status 1.

and logs/genomes/pre_dereplication.log says the following:

Traceback (most recent call last):
  File "/mnt/z/ayu/databases/conda_envs/f4069a5d/bin/dRep", line 26, in <module>
    import drep.argumentParser
  File "/mnt/z/ayu/databases/conda_envs/f4069a5d/lib/python3.6/site-packages/drep/__init__.py", line 5, in <module>
    from Bio import SeqIO
  File "/mnt/z/ayu/databases/conda_envs/f4069a5d/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 382, in <module>
    from Bio.Align import MultipleSeqAlignment
  File "/mnt/z/ayu/databases/conda_envs/f4069a5d/lib/python3.6/site-packages/Bio/Align/__init__.py", line 21, in <module>
    from Bio.Align import substitution_matrices
  File "/mnt/z/ayu/databases/conda_envs/f4069a5d/lib/python3.6/site-packages/Bio/Align/substitution_matrices/__init__.py", line 12, in <module>
    import numpy
ModuleNotFoundError: No module named 'numpy'
logs/genomes/pre_dereplication.log (END)

however

(atlas) animeshs@DMED7596:~/ayu$ pip install numpy
Requirement already satisfied: numpy in /home/animeshs/miniconda3/envs/atlas/lib/python3.6/site-packages (1.19.5)

so I am not sure what the issue is here?
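One thing worth noting: the pip install numpy above checks the atlas environment itself, while the failing rule runs inside the separate conda env shown in the log (conda-env: /mnt/z/ayu/databases/conda_envs/f4069a5d). A minimal sketch to query the right interpreter (the env path is machine-specific, taken from the log above):

```shell
# Sketch: check whether a given python interpreter can import numpy.
check_numpy() {
  "$1" -c 'import numpy' 2>/dev/null && echo present || echo missing
}
# The rule's env, not the atlas env (path copied from the error log above):
check_numpy /mnt/z/ayu/databases/conda_envs/f4069a5d/bin/python
```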

SilasK commented 3 years ago

You finished the binning. I think this is a good point to update to the latest atlas version.

Alternatively you could

conda activate /mnt/z/ayu/databases/conda_envs/f4069a5d
conda install -y numpy
conda deactivate

And try again

Maybe rename the hidden folder '.snakemake' to be safe.

Also, note the Provided cores: 1 (use --cores to define parallelism) message; you may want to pass --cores.

animesh commented 3 years ago

Thanks for the suggestions @SilasK 👍🏽 I tried your alternative, but it failed as follows. Should I just go ahead with the update? If so, what is the easiest way without losing all the work done?

(base) animeshs@DMED7596:~$ conda activate /mnt/z/ayu/databases/conda_envs/f4069a5d
(/mnt/z/ayu/databases/conda_envs/f4069a5d) animeshs@DMED7596:~$ conda install -y numpy
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /mnt/z/ayu/databases/conda_envs/f4069a5d

  added / updated specs:
    - numpy

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2021.5.30          |   py36h06a4308_0         139 KB
    numpy-1.19.2               |   py36h6163131_0          22 KB
    numpy-base-1.19.2          |   py36h75fe3a5_0         4.1 MB
    ------------------------------------------------------------
                                           Total:         4.3 MB

The following NEW packages will be INSTALLED:

  blas               pkgs/main/linux-64::blas-1.0-openblas
  numpy              pkgs/main/linux-64::numpy-1.19.2-py36h6163131_0
  numpy-base         pkgs/main/linux-64::numpy-base-1.19.2-py36h75fe3a5_0

The following packages will be UPDATED:

  ca-certificates    conda-forge::ca-certificates-2020.12.~ --> pkgs/main::ca-certificates-2021.5.25-h06a4308_1
  certifi            conda-forge::certifi-2020.12.5-py36h5~ --> pkgs/main::certifi-2021.5.30-py36h06a4308_0
  openssl            conda-forge::openssl-1.1.1i-h7f98852_0 --> pkgs/main::openssl-1.1.1k-h27cfd23_0

Downloading and Extracting Packages
numpy-base-1.19.2    | 4.1 MB    | ################################################################################################################################## | 100%
numpy-1.19.2         | 22 KB     | ################################################################################################################################## | 100%
certifi-2021.5.30    | 139 KB    | ################################################################################################################################## | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
ERROR conda.core.link:_execute(698): An error occurred while installing package 'defaults::numpy-base-1.19.2-py36h75fe3a5_0'.
Rolling back transaction: done

[Errno 2] No such file or directory: '/mnt/z/ayu/databases/conda_envs/f4069a5d/lib/python3.6/site-packages/numpy/__pycache__'
SilasK commented 3 years ago

In theory there should be no incompatibilities between different pipeline versions (until a major version update).

To be sure, I suggest running:

atlas run binning

to finish all the steps before dereplication, and then update.

animesh commented 3 years ago

Binning went fine, it seems, but the update via python setup.py install is probably not working, as it still says:

(atlas) animeshs@DMED7596:~/ayu$ atlas --version
atlas, version 2.4.4

Assuming it went fine, re-invocation gave the following error; any idea what's up?

[2021-06-22 10:33 INFO] Executing: snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   all  --cores 8
Building DAG of jobs...
Updating job 81 (combine_egg_nogg_annotations).
Using shell: /bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       align
        19      align_reads_to_MAGs
        1       all
        1       all_gtdb_trees
        1       all_prodigal
        19      bam_2_sam_MAGs
        1       build_db_genomes
        1       classify
        1       combine_bined_coverages_MAGs
        1       combine_coverages_MAGs
        19      convert_sam_to_bam
        1       first_dereplication
        1       gene2genome
        1       genomes
        1       identify
        19      pileup_MAGs
        1       rename_genomes
        1       run_all_checkm_lineage_wf
        1       second_dereplication
        91

[Tue Jun 22 10:34:49 2021]
rule first_dereplication:
    input: genomes/all_bins, genomes/quality.csv
    output: genomes/pre_dereplication/dereplicated_genomes
    log: logs/genomes/pre_dereplication.log
    jobid: 463
    threads: 8
    resources: mem=160, time=12

Job counts:
        count   jobs
        1       first_dereplication
        1
[Tue Jun 22 10:34:53 2021]
Error in rule first_dereplication:
    jobid: 0
    output: genomes/pre_dereplication/dereplicated_genomes
    log: logs/genomes/pre_dereplication.log (check log file(s) for error message)
    conda-env: /mnt/z/ayu/databases/conda_envs/f4069a5d
    shell:
         rm -rf genomes/pre_dereplication ; dRep dereplicate    --genomes genomes/all_bins/*.fasta  --genomeInfo genomes/quality.csv  --length 5000  --completeness 50  --contamination 10  --SkipSecondary  --P_ani 0.95  --completeness_weight 1  --contamination_weight 5  --strain_heterogeneity_weight 1  --N50_weight 0.5  --size_weight 0  --MASH_sketch 5000  --processors 8    genomes/pre_dereplication  &> logs/genomes/pre_dereplication.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Exiting because a job execution failed. Look above for error message
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: /mnt/z/ayu/.snakemake/log/2021-06-22T103345.266546.snakemake.log
[2021-06-22 10:34 CRITICAL] Command 'snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   all  --cores 8 ' returned non-zero exit status 1.
SilasK commented 3 years ago

Can you try to update to v2.6a2?

animesh commented 3 years ago

I tried cloning the repo and running the setup, but I still get the same version name, so I am not sure how to go about it?

SilasK commented 3 years ago

I don't know. I have v2.6a2 on conda and on GitHub. In the newest version I have updated the dereplication, so this should circumvent/solve your problem.

Why not use mamba to install the new atlas version?

animesh commented 3 years ago

The situation (appended below) is the same even with mamba, so I guess the binary is not being replaced?

(atlas) animeshs@DMED7596:~/ayu/atlas$ mamba install metagenome-atlas

                  __    __    __    __
                 /  \  /  \  /  \  /  \
                /    \/    \/    \/    \
███████████████/  /██/  /██/  /██/  /████████████████████████
              /  / \   / \   / \   / \  \____
             /  /   \_/   \_/   \_/   \    o \__,
            / _/                       \_____/  `
            |/
        ███╗   ███╗ █████╗ ███╗   ███╗██████╗  █████╗
        ████╗ ████║██╔══██╗████╗ ████║██╔══██╗██╔══██╗
        ██╔████╔██║███████║██╔████╔██║██████╔╝███████║
        ██║╚██╔╝██║██╔══██║██║╚██╔╝██║██╔══██╗██╔══██║
        ██║ ╚═╝ ██║██║  ██║██║ ╚═╝ ██║██████╔╝██║  ██║
        ╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚═════╝ ╚═╝  ╚═╝

        mamba (0.7.14) supported by @QuantStack

        GitHub:  https://github.com/mamba-org/mamba
        Twitter: https://twitter.com/QuantStack

█████████████████████████████████████████████████████████████

Looking for: ['metagenome-atlas']

pkgs/main/linux-64       Using cache
pkgs/main/noarch         Using cache
pkgs/r/linux-64          Using cache
pkgs/r/noarch            Using cache
Transaction

  Prefix: /home/animeshs/miniconda3/envs/atlas

  All requested packages already installed

(atlas) animeshs@DMED7596:~/ayu/atlas$ atlas --version
atlas, version 2.4.4
SilasK commented 3 years ago

It is, you just need to tell mamba explicitly: mamba install metagenome-atlas=2.6a2, or maybe try mamba update.

jmtsuji commented 3 years ago

@animesh Sometimes I find that you need to remove the entire conda env and then re-create it from scratch to get software to update to a higher version properly, e.g.:

conda env remove -n atlas
conda create -y -n atlas -c conda-forge mamba python=3.7
conda activate atlas
mamba install -c bioconda -c conda-forge atlas=2.6a2
animesh commented 3 years ago

Thanks @SilasK @jmtsuji, mamba install -c bioconda -c conda-forge metagenome-atlas=2.6a2 seems to have worked:

(atlas) animeshs@DMED7596:~/ayu/atlas$ atlas --version
atlas, version 2.6a2

but the run seems to be stuck at the following for an hour:

[2021-06-23 10:21 INFO] Executing: snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   --scheduler greedy  all  --cores 8
localrules directive specifies rules that are not present in the Snakefile:
        verify_eggNOG_files

Building DAG of jobs...
Updating job combine_egg_nogg_annotations.
Creating conda environment /home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/rules/../envs/checkm.yaml...
Downloading and installing remote packages.
Environment for ../../../home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/envs/checkm.yaml created (location: databases/conda_envs/36b789d6b4ba8a7de1acdd08ea16a9b3)
Creating conda environment /home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/rules/../envs/report.yaml...
Downloading and installing remote packages.

is this normal?

SilasK commented 3 years ago

It's not uncommon that the report environment takes long to install, but I hoped it would be better with the new version. Do you have snakemake version >6.1?

animesh commented 3 years ago

Yes, it took a couple of hours and then progressed 👍🏽 although it crashed later, just 3% short of completion:

Refining topology: 25 rounds ME-NNIs, 2 rounds ME-SPRs, 13 rounds ML-NNIs
Total branch-length 16.700 after 5.99 sec, 1 of 78 splits
ML-NNI round 1: LogLk = -422511.503 NNIs 3 max delta 52.41 Time 15.31
Switched to using 20 rate categories (CAT approximation)20 of 20
Rate categories were divided by 1.058 so that average rate = 1.0
CAT-based log-likelihoods may not be comparable across runs
Use -gamma for approximate but comparable Gamma(20) log-likelihoods
ML-NNI round 2: LogLk = -400196.696 NNIs 0 max delta 0.00 Time 22.27
Turning off heuristics for final round of ML NNIs (converged)
ML-NNI round 3: LogLk = -399877.303 NNIs 0 max delta 0.00 Time 29.33 (final)
Optimize all lengths: LogLk = -399875.379 Time 31.69
Total time: 37.59 seconds Unique: 80/80 Bad splits: 0/77
[Wed Jun 23 19:57:04 2021]
Finished job 1314.
134 of 139 steps (96%) done

[Wed Jun 23 19:57:04 2021]
localrule root_tree:
    input: genomes/tree/gtdbtk.bac120.unrooted.nwk
    output: genomes/tree/gtdbtk.bac120.nwk
    log: logs/genomes/tree/root_tree_gtdbtk.bac120.log
    jobid: 1313
    wildcards: msa=gtdbtk.bac120
    resources: tmpdir=/tmp, mem=160, time=12

Activating conda environment: /mnt/z/ayu/databases/conda_envs/0dac41a8
Activating conda environment: /mnt/z/ayu/databases/conda_envs/0dac41a8
Removing temporary output file genomes/tree/gtdbtk.bac120.unrooted.nwk.
[Wed Jun 23 19:57:17 2021]
Finished job 1313.
135 of 139 steps (97%) done

[Wed Jun 23 19:57:17 2021]
rule classify:
    input: genomes/taxonomy/gtdb/align, genomes/genomes
    output: genomes/taxonomy/gtdb/classify
    log: logs/taxonomy/gtdbtk/classify.txt, genomes/taxonomy/gtdb/gtdbtk.log
    jobid: 1201
    threads: 8
    resources: tmpdir=/tmp, mem=160, time=24

Activating conda environment: /mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de
[Thu Jun 24 00:21:26 2021]
Error in rule classify:
    jobid: 1201
    output: genomes/taxonomy/gtdb/classify
    log: logs/taxonomy/gtdbtk/classify.txt, genomes/taxonomy/gtdb/gtdbtk.log (check log file(s) for error message)
    conda-env: /mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de
    shell:
        GTDBTK_DATA_PATH=/mnt/z/ayu/databases/GTDB_V06 ; gtdbtk classify --genome_dir genomes/genomes --align_dir genomes/taxonomy/gtdb --out_dir genomes/taxonomy/gtdb --extension fasta --cpus 8 &> logs/taxonomy/gtdbtk/classify.txt
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job classify since they might be corrupted:
genomes/taxonomy/gtdb/classify
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: /mnt/z/ayu/.snakemake/log/2021-06-23T102101.523504.snakemake.log
[2021-06-24 00:21 CRITICAL] Command 'snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   --scheduler greedy  all  --cores 8 ' returned non-zero exit status 1.

Digging into logs/taxonomy/gtdbtk/classify.txt, the issue seems to be pplacer, and it looks like there are multiple versions of it:

(atlas) animeshs@DMED7596:~/ayu$ find . -iname "pplacer"
./databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de/bin/pplacer
./databases/conda_envs/36b789d6b4ba8a7de1acdd08ea16a9b3/bin/pplacer
./databases/conda_envs/4290e12d/bin/pplacer
./databases/conda_envs/d83cddba/bin/pplacer
./databases/GTDB_V05/pplacer
./databases/GTDB_V06/pplacer

Could that be the issue? BTW, the snakemake version is:

(atlas) animeshs@DMED7596:~/ayu$ snakemake --version
6.5.0

SilasK commented 3 years ago

Is the genomes/taxonomy/gtdb/classify/intermediate_results/pplacer/pplacer.bac120.out available?

Do you really have the 160 GB available to pplacer? This tool often uses a lot of resources.

animesh commented 3 years ago

I have 128 GB real RAM and allowed 256 GB as swap; can that be the issue? Also, I can't find pplacer.bac120.out in the pwd; is it supposed to be somewhere else?
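(For reference, a quick way to look for that file from the working directory; a sketch using the filename pattern from the comment above:)

```shell
# Sketch: search a GTDB-Tk output directory for pplacer intermediate files.
find_pplacer_out() {
  find "$1" -name 'pplacer.bac120.*' 2>/dev/null
}
# In the atlas working directory this would be:
# find_pplacer_out genomes/taxonomy/gtdb
```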

SilasK commented 3 years ago

I have 128 GB real RAM and allowed 256 GB as swap; can that be the issue?

Yes, I think this could be the issue. Limiting large_mem in the config file to what you actually have is probably best.

animesh commented 3 years ago

I reduced it to ~60 GB (config.zip) but it failed with the following message:

[2021-06-24 14:42 INFO] Executing: snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   --scheduler greedy  all  --cores 12
localrules directive specifies rules that are not present in the Snakefile:
        verify_eggNOG_files

Building DAG of jobs...
Updating job build_db_genomes.
Updating job combine_bined_coverages_MAGs.
Updating job combine_coverages_MAGs.
Updating job run_all_checkm_lineage_wf.
Updating job identify.
Updating job classify.
Updating job all_prodigal.
Updating job genomes.
Updating job gene2genome.
Updating job all_gtdb_trees.
Updating job classify.
Updating job combine_egg_nogg_annotations.
Using shell: /bin/bash
Provided cores: 12
Rules claiming more threads will be scaled down.
Singularity containers: ignored
Job stats:
job               count    min threads    max threads
--------------  -------  -------------  -------------
all                   1              1              1
all_gtdb_trees        1              1              1
classify              1             12             12
genomes               1              1              1
total                 4              1             12

[Thu Jun 24 14:42:30 2021]
rule classify:
    input: genomes/taxonomy/gtdb/align, genomes/genomes
    output: genomes/taxonomy/gtdb/classify
    log: logs/taxonomy/gtdbtk/classify.txt, genomes/taxonomy/gtdb/gtdbtk.log
    jobid: 1201
    threads: 12
    resources: tmpdir=/tmp, mem=60, time=24

Activating conda environment: /mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de
[Thu Jun 24 19:00:49 2021]
Error in rule classify:
    jobid: 1201
    output: genomes/taxonomy/gtdb/classify
    log: logs/taxonomy/gtdbtk/classify.txt, genomes/taxonomy/gtdb/gtdbtk.log (check log file(s) for error message)
    conda-env: /mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de
    shell:
        GTDBTK_DATA_PATH=/mnt/z/ayu/databases/GTDB_V06 ; gtdbtk classify --genome_dir genomes/genomes --align_dir genomes/taxonomy/gtdb --out_dir genomes/taxonomy/gtdb --extension fasta --cpus 12 &> logs/taxonomy/gtdbtk/classify.txt
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job classify since they might be corrupted:
genomes/taxonomy/gtdb/classify
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: /mnt/z/ayu/.snakemake/log/2021-06-24T144202.958779.snakemake.log
[2021-06-24 19:01 CRITICAL] Command 'snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   --scheduler greedy  all  --cores 12 ' returned non-zero exit status 1.

Looking at classify.txt, it says genomes/taxonomy/gtdb/classify/intermediate_results/pplacer/pplacer.bac120.json has no placements! and gtdbtk.log complains about RAM... any ideas to get out of this catch-22-like situation?

SilasK commented 3 years ago

I assume this is the error

WARNING: pplacer requires ~204 GB of RAM to fully load the bacterial tree into memory.

@animesh You need to set the memory to >210 GB:

large_mem: 250

and threads <=8

Just check that in genomes/taxonomy/gtdb/align everything is OK, e.g. that there are genomes to be placed.

@zztin

zztin commented 3 years ago

Where do I add the large_mem: parameter?


SilasK commented 3 years ago

In the atlas config file in the working directory.
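For example, a fragment of config.yaml (a sketch; key names as they appear in this thread, values illustrative, all other entries omitted):

```
# config.yaml in the atlas working directory (fragment)
threads: 8
mem: 60        # standard job memory, in GB
large_mem: 250 # memory for big jobs such as GTDB-Tk/pplacer, in GB
```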

animesh commented 3 years ago

I have the following in the align folder; does it look fine?

(atlas) animeshs@DMED7596:~/ayu$ ls -ltrh genomes/taxonomy/gtdb/align/
total 226M
drwxrwxrwx 1 animeshs animeshs 4.0K Jun 23 19:53 intermediate_results
-rwxrwxrwx 1 animeshs animeshs    0 Jun 23 19:56 gtdbtk.bac120.filtered.tsv
-rwxrwxrwx 1 animeshs animeshs 226M Jun 23 19:56 gtdbtk.bac120.msa.fasta
-rwxrwxrwx 1 animeshs animeshs 395K Jun 23 19:56 gtdbtk.bac120.user_msa.fasta

And I tried with large_mem:

[2021-06-25 11:27 INFO] Executing: snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   --scheduler greedy  all  --cores 8
localrules directive specifies rules that are not present in the Snakefile:
        verify_eggNOG_files

Building DAG of jobs...
Updating job build_db_genomes.
Updating job combine_bined_coverages_MAGs.
Updating job combine_coverages_MAGs.
Updating job run_all_checkm_lineage_wf.
Updating job identify.
Updating job classify.
Updating job all_prodigal.
Updating job genomes.
Updating job gene2genome.
Updating job all_gtdb_trees.
Updating job classify.
Updating job combine_egg_nogg_annotations.
Using shell: /bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Singularity containers: ignored
Job stats:
job               count    min threads    max threads
--------------  -------  -------------  -------------
all                   1              1              1
all_gtdb_trees        1              1              1
classify              1              8              8
genomes               1              1              1
total                 4              1              8

[Fri Jun 25 11:28:05 2021]
rule classify:
    input: genomes/taxonomy/gtdb/align, genomes/genomes
    output: genomes/taxonomy/gtdb/classify
    log: logs/taxonomy/gtdbtk/classify.txt, genomes/taxonomy/gtdb/gtdbtk.log
    jobid: 1201
    threads: 8
    resources: tmpdir=/tmp, mem=250, time=24

Activating conda environment: /mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de
[Fri Jun 25 16:13:01 2021]
Error in rule classify:
    jobid: 1201
    output: genomes/taxonomy/gtdb/classify
    log: logs/taxonomy/gtdbtk/classify.txt, genomes/taxonomy/gtdb/gtdbtk.log (check log file(s) for error message)
    conda-env: /mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de
    shell:
        GTDBTK_DATA_PATH=/mnt/z/ayu/databases/GTDB_V06 ; gtdbtk classify --genome_dir genomes/genomes --align_dir genomes/taxonomy/gtdb --out_dir genomes/taxonomy/gtdb --extension fasta --cpus 8 &> logs/taxonomy/gtdbtk/classify.txt
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job classify since they might be corrupted:
genomes/taxonomy/gtdb/classify
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: /mnt/z/ayu/.snakemake/log/2021-06-25T112738.602238.snakemake.log
[2021-06-25 16:13 CRITICAL] Command 'snakemake --snakefile /home/animeshs/miniconda3/envs/atlas/lib/python3.7/site-packages/atlas/Snakefile --directory /mnt/z/ayu  --rerun-incomplete --configfile '/mnt/z/ayu/config.yaml' --nolock   --use-conda --conda-prefix /mnt/z/ayu/databases/conda_envs   --scheduler greedy  all  --cores 8 ' returned non-zero exit status 1.

but the error remains

[2021-06-25 11:28:12] INFO: GTDB-Tk v1.5.0
[2021-06-25 11:28:12] INFO: gtdbtk classify --genome_dir genomes/genomes --align_dir genomes/taxonomy/gtdb --out_dir genomes/taxonomy/gtdb --extension fasta --cpus 8
[2021-06-25 11:28:12] INFO: Using GTDB-Tk reference data version r202: /mnt/z/ayu/databases/GTDB_V06
[2021-06-25 11:28:14] WARNING: pplacer requires ~204 GB of RAM to fully load the bacterial tree into memory. However, 65.86 GB was detected. This may affect pplacer performance, or fail if there is insufficient swap space.
[2021-06-25 11:28:14] TASK: Placing 80 bacterial genomes into reference tree with pplacer using 8 CPUs (be patient).
[2021-06-25 11:28:14] INFO: pplacer version: v1.1.alpha19-0-g807f6f3
[2021-06-25 16:12:53] ERROR: An error was encountered while running tog.
[2021-06-25 16:12:53] ERROR: Controlled exit resulting from an unrecoverable error or warning.

================================================================================
EXCEPTION: TogException
  MESSAGE: b'Uncaught exception: Failure("genomes/taxonomy/gtdb/classify/intermediate_results/pplacer/pplacer.bac120.json has no placements!")\nFatal error: exception Failure("genomes/taxonomy/gtdb/classify/intermediate_results/pplacer/pplacer.bac120.json has no placements!")\n'
________________________________________________________________________________

Traceback (most recent call last):
  File "/mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de/lib/python3.8/site-packages/gtdbtk/__main__.py", line 95, in main
    gt_parser.parse_options(args)
  File "/mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de/lib/python3.8/site-packages/gtdbtk/main.py", line 735, in parse_options
    self.classify(options)
  File "/mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de/lib/python3.8/site-packages/gtdbtk/main.py", line 440, in classify
    classify.run(genomes,
  File "/mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de/lib/python3.8/site-packages/gtdbtk/classify.py", line 444, in run
    classify_tree = self.place_genomes(user_msa_file,
  File "/mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de/lib/python3.8/site-packages/gtdbtk/classify.py", line 261, in place_genomes
    pplacer.tog(pplacer_json_out, tree_file)
  File "/mnt/z/ayu/databases/conda_envs/2bbacb1a5eea0785a80f07e0a09d94de/lib/python3.8/site-packages/gtdbtk/external/pplacer.py", line 235, in tog
    raise TogException(proc_err)
gtdbtk.exceptions.TogException: b'Uncaught exception: Failure("genomes/taxonomy/gtdb/classify/intermediate_results/pplacer/pplacer.bac120.json has no placements!")\nFatal error: exception Failure("genomes/taxonomy/gtdb/classify/intermediate_results/pplacer/pplacer.bac120.json has no placements!")\n'
================================================================================
genomes/taxonomy/gtdb/gtdbtk.log (END)

but the swap seems to be sufficient?

(atlas) animeshs@DMED7596:~/ayu$ free
              total        used        free      shared  buff/cache   available
Mem:       65863788      291716    65473196          32       98876    65066708
Swap:     201326592        5280   201321312

so I guess this is not an atlas issue per se; wondering if there is a way to make pplacer use this then?

SilasK commented 3 years ago

Probably pplacer uses even more memory.

See also: https://ecogenomics.github.io/GTDBTk/faq.html. I think pplacer uses ~200 for loading the tree plus ~150 per thread of memory.

I managed to run the example data with 3 genomes and pplacer used only 1.

Do I understand it correctly: you are trying to use a tool that needs >250 GB on a machine with 60 GB, and most of the resources are in swap space?

Don't you have a cluster node with more memory? Do you really want to run the GTDB classification? You are almost there now, but remember that you could deactivate this annotation.

I would try to decrease the number of threads to one or two.

Note to myself:

animesh commented 3 years ago

I tried with 350 but yes, mostly on swap, i.e. 216 GB, as I don't have any more physical RAM to go further :( This failed even with --cores=1, so I guess I need to move to a machine with at least 350 GB of physical RAM? Wondering what would be the way to move this analysis further using an HPC?

SilasK commented 3 years ago

Good news: Atlas is designed to run on an HPC! https://metagenome-atlas.readthedocs.io/en/latest/usage/getting_started.html#cluster-execution

animesh commented 3 years ago

Great, so can I just move this folder and invoke the "run all" command, and it should start from where it crashed locally, or are there some more tricks to be aware of?

SilasK commented 3 years ago

Have you already installed the cluster wrapper as described in the docs? Which HPC system do you have? Do you have different partition/queue names, e.g. one for big-memory jobs?

Yes, you can start from where you left off. If you need to copy, copy also the hidden folder .snakemake; it's not strictly necessary but probably better.

It should be quite easy, but HPC systems always have some surprises ready.