metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

database download #179

Closed Mechah closed 5 years ago

Mechah commented 5 years ago

Hi, I receive an error when I'm trying to download the databases with: atlas download --db-dir databases/atlas/ on several systems.

KeyError in line 131 of /home/data/galaxy_tool_dependencies/_conda/envs/atlas@2.1/atlas/atlas/rules/download.snakefile:
'diamond_mem'
File "/home/data/galaxy_tool_dependencies/_conda/envs/atlas@2.1/atlas/atlas/rules/download.snakefile", line 131, in

or

[2019-02-28 14:01 INFO] Executing: snakemake --snakefile /home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/download.snakefile --printshellcmds --jobs 1 --rerun-incomplete --nolock --config database_dir='/home/cluster/o_mahnerta/databases/atlas' --
KeyError in line 131 of /home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/download.snakefile:
'diamond_mem'
File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/download.snakefile", line 131, in
[2019-02-28 14:01 CRITICAL] Command 'snakemake --snakefile /home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/download.snakefile --printshellcmds --jobs 1 --rerun-incomplete --nolock --config database_dir='/home/cluster/o_mahnerta/databases/atlas' -- ' returned non-zero exit status 1.

Thanks for any support!
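
The logged command passes only database_dir through --config, and the snakefile then fails while looking up 'diamond_mem'. Below is a minimal sketch of that failure mode in plain Python. The `config` dict, the `lookup` helper and the fallback value are hypothetical stand-ins, not code from atlas.

```python
# Hypothetical stand-in for the dict Snakemake builds from --config.
config = {"database_dir": "databases/atlas"}

# Hypothetical fallback value, chosen only for illustration.
DEFAULT_DIAMOND_MEM = 100


def lookup(cfg, key, default):
    """Return cfg[key], falling back to default instead of raising KeyError."""
    return cfg.get(key, default)


try:
    config["diamond_mem"]  # plain indexing raises, as in the log above
except KeyError as err:
    print("KeyError:", err)

print(lookup(config, "diamond_mem", DEFAULT_DIAMOND_MEM))
```

A lookup with a default is one way a rule file can tolerate a key that was never supplied. Whether the fix in atlas took this form is not stated in the thread.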

SilasK commented 5 years ago

Should be solved in v2.0.5.


SilasK commented 5 years ago

You ran atlas init, right?

Something went wrong while automatically preparing the sample table. You can generate your own samples.tsv in the working directory; see the docs for more info.

Check that the config.yaml was created in the working directory.

For follow-up, can you send me the list of files you gave to atlas init?

For example, if you ran: atlas init path/to/fasta

then send me the output of: ls path/to/fasta

PS: Are you making a galaxy tool?
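
For the sample-table problem above, it can help to see how the input file names group into samples and read directions before writing a table by hand. The sketch below is not atlas code. The `_R1`/`_R2` naming pattern, the function names and the column names are assumptions for illustration; the layout atlas expects for samples.tsv is described in its docs.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz,
# optionally with a trailing _001. Adjust the pattern to your own files.
READ_PATTERN = re.compile(
    r"^(?P<sample>.+?)_(?P<read>R[12])(?:_\d+)?\.(?:fastq|fq)(?:\.gz)?$"
)


def group_reads(filenames):
    """Map sample name -> {'R1': file, 'R2': file}; collect unmatched names."""
    samples = defaultdict(dict)
    unmatched = []
    for name in sorted(filenames):
        match = READ_PATTERN.match(Path(name).name)
        if match is None:
            unmatched.append(name)
            continue
        samples[match["sample"]][match["read"]] = name
    return dict(samples), unmatched


def to_tsv(samples):
    """Render a tab-separated table; these column names are hypothetical."""
    lines = ["sample\tR1\tR2"]
    for sample, reads in sorted(samples.items()):
        lines.append(f"{sample}\t{reads.get('R1', '')}\t{reads.get('R2', '')}")
    return "\n".join(lines) + "\n"
```

Calling `group_reads(p.name for p in Path("path/to/fasta").iterdir())` lists files that do not fit the pattern in `unmatched`, which are likely candidates for what trips up the automatic detection.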

On 4 Mar 2019, 10:14 +0100, Mechah notifications@github.com, wrote:

Dear Silas, we receive a new error with atlas version 2.0.5

Traceback (most recent call last):
  File "/home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/bin/atlas", line 11, in
    sys.exit(cli())
  File "/home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/click/core.py", line 764, in call
    return self.main(*args, **kwargs)
  File "/home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/atlas/conf.py", line 240, in run_init
    prepare_sample_table(path_to_fastq,reads_are_QC=skip_qc,outfile=sample_file)
  File "/home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/atlas/conf.py", line 88, in prepare_sample_table
    assert len(columns) == 1, "expect columns to be only ['R1']"
AssertionError: expect columns to be only ['R1']

Thanks for any help!

Mechah commented 5 years ago

Dear Silas,

At the moment we are still testing, but some of the tools we use regularly do end up wrapped into Galaxy, so maybe one day we will do that.

For now I got another error (see below) with atlas run after successful execution of atlas init. The software can't create the conda environment for DASTool. Any ideas on how to proceed?

Thanks, Alex

Could not create conda environment from /home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/atlas/rules/../envs/DASTool.yaml:
Fetching package metadata .................
Solving package specifications:

ResolvePackageNotFound:

[2019-03-04 17:07 CRITICAL] Command 'snakemake --snakefile /home/data/galaxy_tool_dependencies/_conda/envs/__metagenome-atlas@2.0.5/lib/python3.6/site-packages/atlas/Snakefile --directory /home/cluster/o_mahnerta/ATLAS_test_5 --printshellcmds --jobs 56 --rerun-incomplete --configfile '/home/cluster/o_mahnerta/ATLAS_test_5/config.yaml' --nolock --use-conda --conda-prefix /home/cluster/o_mahnerta/ATLAS_test_5/databases/conda_envs all ' returned non-zero exit status 1.


--

Dr. Alexander Mahnert

email: alexander.mahnert@medunigraz.at

Phone: 0043-316-385-72815

Interactive Microbiome Research

Department of Internal Medicine

Medical University of Graz

Center for Medical Research (ZMF)

Stiftingtalstrasse 24

8036 Graz

Austria

SilasK commented 5 years ago

Have you added bioconda to your conda channels?

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

Maybe I should add this to the docs...
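
To see whether the three channels from the commands above are actually configured, something like the following sketch could be used. The helper names are made up for illustration, and `configured_channels` assumes that `conda` is on PATH and that `conda config --show channels --json` prints a JSON object with a `channels` list.

```python
import json
import subprocess

# Channels named in the commands above.
REQUIRED_CHANNELS = ("defaults", "bioconda", "conda-forge")


def missing_channels(configured, required=REQUIRED_CHANNELS):
    """Return the required channels that are absent from the configured list."""
    return [name for name in required if name not in configured]


def configured_channels():
    """Ask conda for its channel list (assumes `conda` is on PATH)."""
    result = subprocess.run(
        ["conda", "config", "--show", "channels", "--json"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return json.loads(result.stdout).get("channels", [])
```

`missing_channels(configured_channels())` returning an empty list would suggest that channel setup is not the cause of a ResolvePackageNotFound error.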

You can test it outside of atlas with:

conda create -n test_env das_tool

Mechah commented 5 years ago

Hi, as the channels could not be set globally on our cluster, I executed atlas on my test dataset locally. However, I got another error later on in the pipeline. Again, thanks for helping.

Error in rule initialize_checkm:
    jobid: 68
    output: logs/checkm_init.txt
    log: logs/initialize_checkm.log
    conda-env: /home/cluster/o_mahnerta/atlas_testing/ATLAS_test_6/databases/conda_envs/41148902

RuleException:
CalledProcessError in line 162 of /home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/download.snakefile:
Command 'source activate '/home/cluster/o_mahnerta/atlas_testing/ATLAS_test_6/databases/conda_envs/41148902'; set -euo pipefail; python /home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/initialize_checkm.py /home/cluster/o_mahnerta/atlas_testing/ATLAS_test_6/databases/checkm logs/checkm_init.txt logs/initialize_checkm.log' returned non-zero exit status 1.
File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/download.snakefile", line 162, in __rule_initialize_checkm
File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message


SilasK commented 5 years ago

Error in rule initialize_checkm:

Did you run 'atlas download'? I also encountered this error, but it should be solved now in atlas v2.0.5/2.0.6.

By the way, you don't need to run atlas download. You can also start 'atlas run all' on your test dataset, and the databases are downloaded on the fly.

Beware that if the test dataset is too small you won't get any bins. I'm trying to build a test dataset.

as the channels could not be set globally on our cluster.

Try asking your IT team to add the bioconda and conda-forge channels globally. Conda without these two channels is not worth much. Maybe Google finds other workarounds.
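
One possible workaround when a global setting is not available: conda also reads a per-user `.condarc` in the home directory, and `conda config --add` without `--system` writes to that file. The sketch below writes such a file directly. The function name is hypothetical, and whether a user-level file is honoured on a given cluster is an assumption to check locally.

```python
from pathlib import Path

# Highest priority first. This is the order the three `conda config --add`
# commands earlier in the thread end up with, since --add prepends.
CHANNELS = ["conda-forge", "bioconda", "defaults"]


def write_condarc(path, channels=CHANNELS):
    """Write a minimal condarc with only a channels list.

    Note: this overwrites any existing file at `path`.
    """
    lines = ["channels:"] + [f"  - {name}" for name in channels]
    path = Path(path)
    path.write_text("\n".join(lines) + "\n")
    return path
```

For example, `write_condarc(Path.home() / ".condarc")` creates the user-level file. Back up an existing one first, because the sketch replaces it.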

Mechah commented 5 years ago

Hi, I tested again and got another error with CheckM (see below). We are already one step further (10 of 109 steps, 9%, were completed), and CheckM had already written some files. This time I ran 'atlas run all' with version 2.0.6, using a test dataset from which I had already obtained several genomic bins, so it should not be too small.

What confuses me is that I get errors associated with CheckM right after quality control (step 1). From my point of view, CheckM only makes sense after successful completion of an assembly (step 2) and binning. Is that correct?

I also want to double-check my commands with you: after installing and activating the atlas environment I ran 'atlas init /home/cluster/o_mahnerta/atlas_testing/raw_test_data_2/', and then from the respective working directory I just executed 'atlas run all'. So the software creates the environments, downloads the databases, and should know where they are located. Is that correct?

Looking forward to further help!

[2019-03-06 10:27] Updating CheckM's data directory.
Traceback (most recent call last):
  File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/initialize_checkm.py", line 89, in
    main(args.dbdir, args.confirmation, args.log)
  File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/initialize_checkm.py", line 69, in main
    run_popen(["checkm", "data", "setRoot"], [db_dir, db_dir], stderr=errlog)
  File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/initialize_checkm.py", line 59, in run_popen
    p = Popen(cmd, stdin=PIPE, stdout=PIPE, universal_newlines=True, stderr=stderr)
  File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/subprocess.py", line 709, in init
    restore_signals, start_new_session)
  File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/subprocess.py", line 1344, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'checkm': 'checkm'
[Wed Mar 6 10:27:46 2019]
Error in rule initialize_checkm:
    jobid: 68
    output: logs/checkm_init.txt
    log: logs/initialize_checkm.log
    conda-env: /home/cluster/o_mahnerta/atlas_testing/ATLAS_test_7/databases/conda_envs/2fe68311

RuleException:
CalledProcessError in line 162 of /home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/download.snakefile:
Command 'source activate '/home/cluster/o_mahnerta/atlas_testing/ATLAS_test_7/databases/conda_envs/2fe68311'; set -euo pipefail; python /home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/initialize_checkm.py /home/cluster/o_mahnerta/atlas_testing/ATLAS_test_7/databases/checkm logs/checkm_init.txt logs/initialize_checkm.log' returned non-zero exit status 1.
File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/site-packages/atlas/rules/download.snakefile", line 162, in __rule_initialize_checkm
File "/home/cluster/o_mahnerta/miniconda3/envs/atlasenv/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
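
The traceback above ends in Popen failing because no `checkm` executable could be found on PATH, even though the rule activated its conda environment. A small check like the sketch below can confirm that from inside the activated environment. `require_executable` is a hypothetical helper, not part of atlas.

```python
import shutil


def require_executable(name, path=None):
    """Return the full path of `name`, or raise with a readable message.

    `path` is an optional search path string; by default the PATH
    environment variable is used.
    """
    found = shutil.which(name, path=path)
    if found is None:
        raise FileNotFoundError(
            f"'{name}' was not found on PATH; is the conda environment "
            "that provides it activated and fully installed?"
        )
    return found
```

Running `require_executable("checkm")` after activating the environment named in the `conda-env:` line of the log shows whether the environment was built without CheckM or whether the activation itself did not take effect.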


SilasK commented 5 years ago

This is probably related to #181 (https://github.com/metagenome-atlas/atlas/issues/181), which means you're not the only one.

Mechah commented 5 years ago

Ok, unfortunately my logs/initialize_checkm.log file is just empty. Is there anything else that could help with further troubleshooting?


SilasK commented 5 years ago

The file permission errors are solved in the newest version (development version on the master branch). I recommend removing all the CheckM databases and rerunning the download.
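
A sketch of that cleanup, assuming the layout seen in the logs earlier in this thread: a `databases/checkm` directory and a `logs/checkm_init.txt` output file, both relative to the working directory. The function name is hypothetical, and it is an assumption that deleting the output file is enough for the initialization rule to run again.

```python
import shutil
from pathlib import Path


def reset_checkm(db_dir, flag_file):
    """Delete the CheckM data directory and the init output file.

    Returns the list of paths that were actually removed.
    """
    db_dir, flag_file = Path(db_dir), Path(flag_file)
    removed = []
    if db_dir.is_dir():
        shutil.rmtree(db_dir)
        removed.append(db_dir)
    if flag_file.exists():
        flag_file.unlink()
        removed.append(flag_file)
    return removed
```

For example, `reset_checkm("databases/checkm", "logs/checkm_init.txt")` run from the working directory, followed by rerunning the download.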