rhysnewell / aviary

A hybrid assembly and MAG recovery pipeline (and more!)

error using aviary configure #165

Closed rikander closed 8 months ago

rikander commented 8 months ago

Hi there! Thanks so much for putting this pipeline together. My student and I are trying to get started with aviary configure, but we are running into errors that we can't parse/solve:

Here is the command:

(aviary) [randerson@deming max]$ aviary configure --conda-prefix /usr/local/miniconda3/ --gtdb-path /researchdrive/data/databases/release214/ --checkm2-db-path /researchdrive/data/databases/CHECKM2DB/ --eggnog-db-path /researchdrive/data/databases/EGGNOG_DATA_DIR/ --download

And here is the output:

10/26/2023 01:40:39 PM INFO: Time - 13:40:39 26-10-2023
10/26/2023 01:40:39 PM INFO: Command - /usr/local/miniconda3/bin/aviary configure --conda-prefix /usr/local/miniconda3/ --gtdb-path /researchdrive/data/databases/release214/ --checkm2-db-path /researchdrive/data/databases/CHECKM2DB/ --eggnog-db-path /researchdrive/data/databases/EGGNOG_DATA_DIR/ --download
10/26/2023 01:40:39 PM INFO: Version - 0.8.2
10/26/2023 01:40:39 PM INFO: The current aviary environment variables are:
10/26/2023 01:40:39 PM INFO: CONDA_ENV_PATH: /usr/local/miniconda3/
10/26/2023 01:40:39 PM INFO: TMPDIR: /tmp
10/26/2023 01:40:39 PM INFO: GTDBTK_DATA_PATH: /researchdrive/data/databases/release214/
10/26/2023 01:40:39 PM INFO: EGGNOG_DATA_DIR: /researchdrive/data/databases/EGGNOG_DATA_DIR/
10/26/2023 01:40:39 PM INFO: CHECKM2DB: /researchdrive/data/databases/CHECKM2DB/
10/26/2023 01:40:39 PM INFO: Configuration file written to /Accounts/randerson/testing/max/config.yaml
10/26/2023 01:40:39 PM INFO: Executing: snakemake --snakefile /usr/local/miniconda3/lib/python3.9/site-packages/aviary/modules/Snakefile --directory /Accounts/randerson/testing/max --cores 16 --rerun-incomplete --keep-going  --rerun-triggers mtime --configfile /Accounts/randerson/testing/max/config.yaml --nolock  --retries 0 --conda-frontend mamba --resources mem_mb=256000   --use-conda --conda-prefix /usr/local/miniconda3/   download_databases
usage: snakemake [-h] [--dry-run] [--profile PROFILE] [--cache [RULE ...]] [--snakefile FILE] [--cores [N]] [--local-cores N] [--resources [NAME=INT ...]]
                 [--set-threads RULE=THREADS [RULE=THREADS ...]] [--set-scatter NAME=SCATTERITEMS [NAME=SCATTERITEMS ...]] [--default-resources [NAME=INT ...]]
                 [--preemption-default PREEMPTION_DEFAULT] [--preemptible-rules PREEMPTIBLE_RULES [PREEMPTIBLE_RULES ...]] [--config [KEY=VALUE ...]] [--configfile FILE [FILE ...]]
                 [--envvars VARNAME [VARNAME ...]] [--directory DIR] [--touch] [--keep-going] [--force] [--forceall] [--forcerun [TARGET ...]] [--prioritize TARGET [TARGET ...]]
                 [--batch RULE=BATCH/BATCHES] [--until TARGET [TARGET ...]] [--omit-from TARGET [TARGET ...]] [--rerun-incomplete] [--shadow-prefix DIR] [--scheduler [{ilp,greedy}]]
                 [--wms-monitor [WMS_MONITOR]] [--wms-monitor-arg [NAME=VALUE ...]] [--scheduler-ilp-solver {COIN_CMD}] [--no-subworkflows] [--groups GROUPS [GROUPS ...]]
                 [--group-components GROUP_COMPONENTS [GROUP_COMPONENTS ...]] [--report [FILE]] [--report-stylesheet CSSFILE] [--edit-notebook TARGET] [--notebook-listen IP:PORT]
                 [--lint [{text,json}]] [--generate-unit-tests [TESTPATH]] [--containerize] [--export-cwl FILE] [--list] [--list-target-rules] [--dag] [--rulegraph] [--filegraph] [--d3dag]
                 [--summary] [--detailed-summary] [--archive FILE] [--cleanup-metadata FILE [FILE ...]] [--cleanup-shadow] [--skip-script-cleanup] [--unlock] [--list-version-changes]
                 [--list-code-changes] [--list-input-changes] [--list-params-changes] [--list-untracked] [--delete-all-output] [--delete-temp-output] [--bash-completion] [--keep-incomplete]
                 [--drop-metadata] [--version] [--reason] [--gui [PORT]] [--printshellcmds] [--debug-dag] [--stats FILE] [--nocolor] [--quiet] [--print-compilation] [--verbose]
                 [--force-use-threads] [--allow-ambiguity] [--nolock] [--ignore-incomplete] [--max-inventory-time SECONDS] [--latency-wait SECONDS] [--wait-for-files [FILE ...]] [--notemp]
                 [--keep-remote] [--keep-target-files] [--allowed-rules ALLOWED_RULES [ALLOWED_RULES ...]] [--max-jobs-per-second MAX_JOBS_PER_SECOND]
                 [--max-status-checks-per-second MAX_STATUS_CHECKS_PER_SECOND] [-T RESTART_TIMES] [--attempt ATTEMPT] [--wrapper-prefix WRAPPER_PREFIX]
                 [--default-remote-provider {S3,GS,FTP,SFTP,S3Mocked,gfal,gridftp,iRODS,AzBlob}] [--default-remote-prefix DEFAULT_REMOTE_PREFIX] [--no-shared-fs] [--greediness GREEDINESS]
                 [--no-hooks] [--overwrite-shellcmd OVERWRITE_SHELLCMD] [--debug] [--runtime-profile FILE] [--mode {0,1,2}] [--show-failed-logs] [--log-handler-script FILE]
                 [--log-service {none,slack,wms}] [--cluster CMD | --cluster-sync CMD | --drmaa [ARGS]] [--cluster-config FILE] [--immediate-submit] [--jobscript SCRIPT] [--jobname NAME]
                 [--cluster-status CLUSTER_STATUS] [--drmaa-log-dir DIR] [--kubernetes [NAMESPACE]] [--container-image IMAGE] [--tibanna] [--tibanna-sfn TIBANNA_SFN] [--precommand PRECOMMAND]
                 [--tibanna-config TIBANNA_CONFIG [TIBANNA_CONFIG ...]] [--google-lifesciences] [--google-lifesciences-regions GOOGLE_LIFESCIENCES_REGIONS [GOOGLE_LIFESCIENCES_REGIONS ...]]
                 [--google-lifesciences-location GOOGLE_LIFESCIENCES_LOCATION] [--google-lifesciences-keep-cache] [--tes URL] [--use-conda] [--conda-not-block-search-path-envvars]
                 [--list-conda-envs] [--conda-prefix DIR] [--conda-cleanup-envs] [--conda-cleanup-pkgs [{tarballs,cache}]] [--conda-create-envs-only] [--conda-frontend {conda,mamba}]
                 [--use-singularity] [--singularity-prefix DIR] [--singularity-args ARGS] [--use-envmodules]
                 [target ...]
snakemake: error: unrecognized arguments: --rerun-triggers --retries 0 download_databases
10/26/2023 01:40:40 PM CRITICAL: Command '['snakemake', '--snakefile', '/usr/local/miniconda3/lib/python3.9/site-packages/aviary/modules/Snakefile', '--directory', '/Accounts/randerson/testing/max', '--cores', '16', '--rerun-incomplete', '--keep-going', '--rerun-triggers', 'mtime', '--configfile', '/Accounts/randerson/testing/max/config.yaml', '--nolock', '--retries', '0', '--conda-frontend', 'mamba', '--resources', 'mem_mb=256000', '--use-conda', '--conda-prefix', '/usr/local/miniconda3/', 'download_databases']' returned non-zero exit status 2.

Do you have any ideas for what might be causing this?

Thanks so much for any insights you may have! Thanks, Rika

rhysnewell commented 8 months ago

Hi Rika,

Thanks for trying out Aviary, and apologies that the installation has not gone smoothly. That error seems a bit odd. Would you be able to run snakemake --version from within the aviary conda environment you created?

Cheers, Rhys

rikander commented 8 months ago

Hi Rhys,

Sure thing! Here's what I get:

(aviary) [randerson@deming ~]$ snakemake --version
6.3.0

The only thing I can think of is that it's a permissions issue-- I've installed conda globally for all users following the instructions on the anaconda website, but I still run into issues occasionally, and it's possible this may be the result of that-- but I wasn't able to figure it out based on the error logs.

Thanks, Rika

rhysnewell commented 8 months ago

Okay, nice, this should be an easy fix. Snakemake 6.3.0 does not recognise the --rerun-triggers and --retries options that aviary passes, so it looks like the minimum snakemake version in our aviary.yml is incorrect and we need to bump it up. Can you try upgrading snakemake to something like 7.32.3?
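For anyone hitting the same "unrecognized arguments" error, here is a minimal sketch of one way to check whether an installed snakemake knows the options aviary passes. It simply scans the text of `snakemake --help` for each option name. The function names `missing_flags` and `snakemake_help` are made up for this illustration and are not part of aviary or snakemake.

```python
import re
import subprocess


def missing_flags(help_text, flags):
    """Return the entries of `flags` that never appear in `help_text`."""
    # Collect every long option mentioned in the help text, e.g. "--cores".
    known = set(re.findall(r"--[A-Za-z0-9][A-Za-z0-9-]*", help_text))
    return [flag for flag in flags if flag not in known]


def snakemake_help():
    """Capture what `snakemake --help` prints (needs snakemake on PATH)."""
    result = subprocess.run(
        ["snakemake", "--help"], capture_output=True, text=True, check=False
    )
    return result.stdout + result.stderr


# Hypothetical usage, run inside the aviary conda environment:
#   print(missing_flags(snakemake_help(), ["--rerun-triggers", "--retries"]))
```

An empty list would suggest the installed snakemake accepts both options. With the usage message from the failed run above, both names would be reported as missing.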

rikander commented 8 months ago

Done!

(aviary) [randerson@deming max]$ snakemake --version
7.32.4

... but now I get a new error, which seems like it might be related to mamba.

(aviary) [randerson@deming max]$ aviary configure --conda-prefix /usr/local/miniconda3/ --gtdb-path /researchdrive/data/databases/release214/ --checkm2-db-path /researchdrive/data/databases/CHECKM2DB/ --eggnog-db-path /researchdrive/data/databases/EGGNOG_DATA_DIR/ --download
10/26/2023 10:50:21 PM INFO: Time - 22:50:21 26-10-2023
10/26/2023 10:50:21 PM INFO: Command - /usr/local/miniconda3/bin/aviary configure --conda-prefix /usr/local/miniconda3/ --gtdb-path /researchdrive/data/databases/release214/ --checkm2-db-path /researchdrive/data/databases/CHECKM2DB/ --eggnog-db-path /researchdrive/data/databases/EGGNOG_DATA_DIR/ --download
10/26/2023 10:50:21 PM INFO: Version - 0.8.2
10/26/2023 10:50:21 PM INFO: The current aviary environment variables are:
10/26/2023 10:50:21 PM INFO: CONDA_ENV_PATH: /usr/local/miniconda3/
10/26/2023 10:50:21 PM INFO: TMPDIR: /tmp
10/26/2023 10:50:21 PM INFO: GTDBTK_DATA_PATH: /researchdrive/data/databases/release214/
10/26/2023 10:50:21 PM INFO: EGGNOG_DATA_DIR: /researchdrive/data/databases/EGGNOG_DATA_DIR/
10/26/2023 10:50:21 PM INFO: CHECKM2DB: /researchdrive/data/databases/CHECKM2DB/
10/26/2023 10:50:21 PM INFO: Configuration file written to /Accounts/randerson/testing/max/config.yaml
10/26/2023 10:50:21 PM INFO: Executing: snakemake --snakefile /usr/local/miniconda3/lib/python3.9/site-packages/aviary/modules/Snakefile --directory /Accounts/randerson/testing/max --cores 16 --rerun-incomplete --keep-going  --rerun-triggers mtime --configfile /Accounts/randerson/testing/max/config.yaml --nolock  --retries 0 --conda-frontend mamba --resources mem_mb=256000   --use-conda --conda-prefix /usr/local/miniconda3/   download_databases
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Creating conda environment /usr/local/miniconda3/lib/python3.9/site-packages/aviary/modules/annotation/../../envs/gtdbtk.yaml...
Downloading and installing remote packages.
CreateCondaEnvironmentException:
Could not create conda environment from /usr/local/miniconda3/lib/python3.9/site-packages/aviary/modules/annotation/../../envs/gtdbtk.yaml:
Command:
mamba env create --quiet --file "/usr/local/miniconda3/9f2d1b326b986488aec3f48c33ab7b5a_.yaml" --prefix "/usr/local/miniconda3/9f2d1b326b986488aec3f48c33ab7b5a_"
Output:
Traceback (most recent call last):
  File "/usr/local/miniconda3/condabin/mamba", line 7, in <module>
    from mamba.mamba import main
  File "/usr/local/miniconda3/lib/python3.9/site-packages/mamba/mamba.py", line 16, in <module>
    from conda.cli.common import (
ImportError: cannot import name 'ensure_name_or_prefix' from 'conda.cli.common' (/usr/local/miniconda3/lib/python3.9/site-packages/conda/cli/common.py)

10/26/2023 10:50:25 PM CRITICAL: Command '['snakemake', '--snakefile', '/usr/local/miniconda3/lib/python3.9/site-packages/aviary/modules/Snakefile', '--directory', '/Accounts/randerson/testing/max', '--cores', '16', '--rerun-incomplete', '--keep-going', '--rerun-triggers', 'mtime', '--configfile', '/Accounts/randerson/testing/max/config.yaml', '--nolock', '--retries', '0', '--conda-frontend', 'mamba', '--resources', 'mem_mb=256000', '--use-conda', '--conda-prefix', '/usr/local/miniconda3/', 'download_databases']' returned non-zero exit status 1.
(aviary) [randerson@deming max]$ mamba --version
Traceback (most recent call last):
  File "/usr/local/miniconda3/condabin/mamba", line 7, in <module>
    from mamba.mamba import main
  File "/usr/local/miniconda3/lib/python3.9/site-packages/mamba/mamba.py", line 16, in <module>
    from conda.cli.common import (
ImportError: cannot import name 'ensure_name_or_prefix' from 'conda.cli.common' (/usr/local/miniconda3/lib/python3.9/site-packages/conda/cli/common.py)

I could try upgrading/installing mamba as well if that seems like an appropriate next step!

rhysnewell commented 8 months ago

Hmm, yeah, this looks like a combined conda/mamba issue, as you point out. You might need to check your conda and mamba versions and confirm they are compatible, or just update both to their latest versions if possible.

rhysnewell commented 8 months ago

https://github.com/conda-forge/miniforge/issues/499

Looks like if you update to mamba 1.5.2 it should fix this issue
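A quick way to confirm this kind of mismatch, sketched under the assumption that it is run with the same Python interpreter that owns the conda installation (here, the one under /usr/local/miniconda3). The traceback shows mamba importing `ensure_name_or_prefix` from `conda.cli.common`, so the check below just asks whether that name is present. `module_exports` is a helper name invented for this example.

```python
import importlib


def module_exports(module_name, attr_name):
    """Report whether a module provides a given name.

    Returns True or False when the module imports, and None when the
    module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr_name)


# Hypothetical usage with the names taken from the traceback above:
#   module_exports("conda.cli.common", "ensure_name_or_prefix")
```

A result of False would match the ImportError in the log, meaning the installed conda no longer provides the name that this mamba version expects.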

rhysnewell commented 8 months ago

Hi @rikander, just checking in to see if this issue was resolved?

rikander commented 8 months ago

Hi @rhysnewell, sorry for the silence-- we were trying to get some help on our end after things got confusing. We were able to get snakemake installed in the aviary conda environment via pip (that was the only method that worked for us). I also ended up installing mamba via miniforge. We think we finally got aviary configure to run without errors, which resolves the original issue I raised here, so I think we can consider it closed. (I'm not seeing anything in the CheckM database folder, but that's a different issue that I'll work on separately.)

Thanks for your help!

rhysnewell commented 8 months ago

That's good news, sorry your installation process hasn't been smooth. Not sure why installing snakemake was a struggle, possibly something to do with conda channel priority? But glad you sorted it.

I think Zenodo was having some transient issues over the past week. The CheckM2 database is stored on Zenodo, so a failed download is the likely reason the folder is empty. You could try downloading it again, or do it manually as suggested here: https://github.com/chklovski/CheckM2/issues/83#issuecomment-1767129760
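Before re-running the download, it may help to confirm that the database folder really is empty. This is only a small sketch: `directory_is_empty` is a name chosen for this example, and the path in the usage comment is the one passed to --checkm2-db-path earlier in this thread.

```python
from pathlib import Path


def directory_is_empty(path):
    """Return True if `path` is missing, is not a directory, or has no entries."""
    directory = Path(path)
    if not directory.is_dir():
        return True
    # iterdir() is lazy, so this stops at the first entry it finds.
    return next(directory.iterdir(), None) is None


# Hypothetical usage:
#   directory_is_empty("/researchdrive/data/databases/CHECKM2DB/")
```

If this returns True after aviary configure --download has finished, the download most likely did not complete.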

I'll go ahead and close this now, let me know if you have any other issues