metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

rule initialize_checkm failed #58

Closed ghost closed 6 years ago

ghost commented 6 years ago

I met this problem with the error 'Waiting at most 10 seconds for missing files. Error in job initialize_checkm while creating output files....' The checkm-genome version is checkm-genome: 1.0.7-py35_0 (bioconda). --latency-wait has been set to 10. Please see the attached file (err.txt). Looking forward to your instructions. Thank you.

SilasK commented 6 years ago

I have to look at this tomorrow. @camel315 Are you running this on a cluster? Which system? Do you have internet on the execution machine?

@brwnj I don't understand why checkm doesn't output an error or a log file?

As a manual fix: have a look at the script atlas/atlas/rules/initialize_checkm.py; there you can see the two commands used to download the checkm databases.
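In shell terms, the two download commands amount to something like this sketch (the target directory is whatever database_dir is set to in your config; the path below is a placeholder):

checkm data setRoot /path/to/databases/checkm   # point CheckM at its database folder
checkm data update                              # download/refresh the database files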

ghost commented 6 years ago

@SilasK Yes, I am running this on a cluster. It runs SUSE Linux Enterprise (2011) with the Load Sharing Facility (LSF) batch system and has more than 200 nodes and 3,000 cores. The cluster components are connected by a very fast network.

ghost commented 6 years ago

@SilasK I have probably found the reason, but I do not know how to fix it. I downloaded the checkM data manually from https://data.ace.uq.edu.au/public/CheckM_databases/. This took around 20 minutes in my case. However, when the pipeline runs, these files are automatically deleted, leaving only an empty folder. When checkm then needs the databases, it reports an error and stops running.

SilasK commented 6 years ago

So atlas tries to overwrite the files, but doesn't download them?

What happens when you use the command checkm data setRoot to set the folder and then checkm data update to download the data? Which error do you get?

ghost commented 6 years ago

@SilasK How can I run this test, in initialize_checkm.py or assemble.snakefile? The original script contains:

run_popen(["checkm", "data", "setRoot"], [snakemake.params.database_dir, snakemake.params.database_dir])
run_popen(["checkm", "data", "update"], ["y", "y"])

Shall I change it to:

run_popen(["checkm", "data", "setRoot"], [snakemake.params.database_dir, snakemake.params.database_dir, snakemake.params.database_dir])
run_popen(["checkm", "data", "update"], ["y", "y", "y"])

Also, in the latest version of checkM on GitHub, I did not find the initialize_checkm.py script.

SilasK commented 6 years ago

In #64 I added a log file to the checkm_init rule. If you use the corresponding branch you can test again and see why atlas doesn't work.

git clone https://github.com/pnnl/atlas.git
cd atlas
git checkout assembly
python setup.py install develop

I'm sorry, but as I understand it, checkm is very complicated to integrate into a pipeline.

ghost commented 6 years ago

New errors:

Executing: snakemake --snakefile /home/syang/anaconda3/lib/python3.6/site-packages/atlas/Snakefile --directory /panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang --printshellcmds --jobs 40 --rerun-incomplete --configfile '/panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang/config.yaml' --nolock --use-conda --config workflow=complete --latency-wait 10
WorkflowError in line 159 of /home/syang/anaconda3/lib/python3.6/site-packages/atlas/Snakefile:
Failed to open /home/syang/anaconda3/lib/python3.6/site-packages/atlas/rules/qc.snakefile.
  File "/home/syang/anaconda3/lib/python3.6/site-packages/atlas/Snakefile", line 159, in 
[2017-12-07 11:56 CRITICAL] Command 'snakemake --snakefile /home/syang/anaconda3/lib/python3.6/site-packages/atlas/Snakefile --directory /panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang --printshellcmds --jobs 40 --rerun-incomplete --configfile '/panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang/config.yaml' --nolock --use-conda --config workflow=complete --latency-wait 10' returned non-zero exit status 1.

SilasK commented 6 years ago

I recently split the assemble snakefile in two: qc.snakefile and assemble.snakefile. Somehow the new snakefile didn't get installed:

/home/syang/anaconda3/lib/python3.6/site-packages/atlas/rules/qc.snakefile

Check whether the git repository you downloaded contains atlas/rules/qc.snakefile. Maybe uninstall and reinstall atlas.

ghost commented 6 years ago

After I reinstalled atlas, the initialize_checkm problem persists:

Conda environment defines Python version < 3.3. Using Python of the master process to execute script.
/home/syang/anaconda3/bin/python /home/syang/anaconda3/lib/python3.6/site-packages/atlas/rules/.snakemake.waps8v21.initialize_checkm.py

*******************************************************************************
 [CheckM - data] Check for database updates. [setRoot]
*******************************************************************************

Data location successfully changed to: /panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang/databases/checkm

*******************************************************************************
 [CheckM - data] Check for database updates. [update]
*******************************************************************************

Waiting at most 10 seconds for missing files.
Error in job initialize_checkm while creating output files /panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang/databases/checkm/test_data/637000110.fna, ....
MissingOutputException in line 476 of /home/syang/anaconda3/lib/python3.6/site-packages/atlas/rules/assemble.snakefile:
Missing files after 10 seconds: ...
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Removing output files of failed job initialize_checkm since they might be corrupted: /panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang/databases/checkm/.dmanifest, logs/checkm_init.txt
Will exit after finishing currently running jobs.
Finished job 184. 1 of 241 steps (0.41%) done
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

SilasK commented 6 years ago

Can you now send me the log file working_dir/logs/initialize_checkm.log?

ghost commented 6 years ago

@SilasK Sorry to disturb you. Regarding the development version of atlas, as you suggested:

git clone https://github.com/pnnl/atlas.git
cd atlas
git checkout assembly
python setup.py install develop

This generates a new folder 'atlas', which is not the same as an installation with 'pip install -U pnnl-atlas' or 'pip install git+https://github.com/pnnl/atlas.git'. In this case, will the path automatically point to the new atlas? I am a bit confused on this point.

SilasK commented 6 years ago
pip uninstall pnnl-atlas

should uninstall the installed version of atlas.

Edit: fixed the command to use the package name on PyPI.

ghost commented 6 years ago

@SilasK, it should be pip uninstall pnnl-atlas. I have removed all packages and reinstalled from conda. All the .py and .snakefile files from your atlas folder were manually copied to the folder where atlas (under anaconda3) had been installed with pip install -U pnnl-atlas. An error popped up (snakemake version 4.3.1):

 Executing: snakemake --snakefile /home/syang/anaconda3/lib/python3.6/site-packages/atlas/Snakefile --directory /panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang --printshellcmds --jobs 24 --rerun-incomplete --configfile '/panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang/config.yaml' --nolock --use-conda  --config workflow=complete  --latency-wait 20
wildcard constraints in inputs are ignored
wildcard constraints in inputs are ignored
wildcard constraints in inputs are ignored
wildcard constraints in inputs are ignored
wildcard constraints in inputs are ignored
wildcard constraints in inputs are ignored
Building DAG of jobs...
Creating conda environment /home/syang/anaconda3/lib/python3.6/site-packages/atlas/envs/required_packages.yaml...
Traceback (most recent call last):
  File "/home/syang/anaconda3/lib/python3.6/site-packages/snakemake/conda.py", line 161, in create
    stderr=subprocess.STDOUT)
  File "/home/syang/anaconda3/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/home/syang/anaconda3/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['conda', 'env', 'create', '--file', '/home/syang/anaconda3/lib/python3.6/site-packages/atlas/envs/required_packages.yaml', '--prefix', '/panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang/.snakemake/conda/1765b780']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/syang/anaconda3/lib/python3.6/site-packages/snakemake/__init__.py", line 520, in snakemake
    cluster_status=cluster_status)
  File "/home/syang/anaconda3/lib/python3.6/site-packages/snakemake/workflow.py", line 518, in execute
    dag.create_conda_envs(dryrun=dryrun)
  File "/home/syang/anaconda3/lib/python3.6/site-packages/snakemake/dag.py", line 172, in create_conda_envs
    env.create(dryrun)
  File "/home/syang/anaconda3/lib/python3.6/site-packages/snakemake/conda.py", line 170, in create
    e.output.decode())
snakemake.exceptions.CreateCondaEnvironmentException: Could not create conda environment from /home/syang/anaconda3/lib/python3.6/site-packages/atlas/envs/required_packages.yaml:
Fetching package metadata ...Using Anaconda API: https://api.anaconda.org

CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/bioconda/linux-64/repodata.json>
Elapsed: -

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
ConnectionError(MaxRetryError("HTTPSConnectionPool(host='conda.anaconda.org', port=443): Max retries exceeded with url: /bioconda/linux-64/repodata.json (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x2b48b4b6aa20>: Failed to establish a new connection: [Errno -2] Name or service not known',))",),)

[2017-12-08 12:17 CRITICAL] Command 'snakemake --snakefile /home/syang/anaconda3/lib/python3.6/site-packages/atlas/Snakefile --directory /panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang --printshellcmds --jobs 24 --rerun-incomplete --configfile '/panfs/panfs14.gfz-hpcc.cluster/home/gmb/syang/config.yaml' --nolock --use-conda  --config workflow=complete  --latency-wait 20' returned non-zero exit status 1.
SilasK commented 6 years ago

@camel315 Might it be that you don't have internet access on the server? You can also try to run atlas on your local server with the additional argument --until initialize_checkm. This should execute only that one step.

Edit: updated the --until command to the correct spelling of the rule.
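For example, a hypothetical invocation (assuming atlas passes extra flags such as --until through to snakemake, as it does with --latency-wait elsewhere in this thread):

atlas assemble --jobs 4 --out-dir results config.yaml --until initialize_checkm   # run only the checkm setup step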

ghost commented 6 years ago

@SilasK It was running on the cluster. I can download the atlas files from git, e.g.:

syang@glic2: git clone https://github.com/pnnl/atlas.git
Cloning into 'atlas'...
remote: Counting objects: 2679, done.
remote: Compressing objects: 100% (164/164), done.
remote: Total 2679 (delta 207), reused 189 (delta 122), pack-reused 2393
Receiving objects: 100% (2679/2679), 4.99 MiB | 2.80 MiB/s, done.
Resolving deltas: 100% (1686/1686), done.

SilasK commented 6 years ago

It seems you don't have internet access:

CondaHTTPError: HTTP 000 CONNECTION FAILED for url https://conda.anaconda.org/bioconda/linux-64/repodata.json

What happens when you run:

conda env create --file /home/syang/anaconda3/lib/python3.6/site-packages/atlas/envs/required_packages.yaml

ghost commented 6 years ago

@SilasK

conda env create --file /home/syang/anaconda3/lib/python3.6/site-packages/atlas/envs/required_packages.yaml
Using Anaconda API: https://api.anaconda.org

CondaValueError: prefix already exists: /home/syang/anaconda3

SilasK commented 6 years ago

Can you answer my question: do you have internet on the executing cluster machine?

SilasK commented 6 years ago

Starting again, create a new conda environment:

conda create -n atlas_env -c bioconda python=3.5 snakemake bbmap=37.17 click

source activate atlas_env
git clone https://github.com/pnnl/atlas.git
cd atlas
git checkout assembly
python setup.py install develop

Now you have a fresh atlas version in the conda environment atlas_env.

You can also try to run atlas on your local server with the additional argument --until initialize_checkm.

ghost commented 6 years ago

@SilasK I still haven't solved my problem and cannot determine its source. Could it be related to a conda problem? For example, in 'optional_genome_binning.yaml' the package 'maxbin2=2.2.1=r3.3.2_1' could not be found with 'conda search -c bioconda maxbin2'. Could this cause an error when running conda?

SilasK commented 6 years ago

Do you have internet on the executing machines?

Can you install maxbin using the file, e.g. conda create -n maxbin2_env --file optional_genome_binning.yaml or similar?

Try to run atlas without the genome binning, so you at least get the contigs and everything else.

ghost commented 6 years ago

@SilasK For maxbin2 and the related fraggenescan, the search gives positive results (conda search -c bioconda maxbin2), but installation fails (conda install -c bioconda maxbin2):

syang@glic2:~> conda search -c bioconda fraggenescan
Fetching package metadata .............
fraggenescan                 1.30                 pl5.22.0_0  bioconda        
                             1.30                 pl5.22.0_1  bioconda        
syang@glic2:~> conda install -c bioconda fraggenescan
Fetching package metadata .............
Solving package specifications: 

PackageNotFoundError: Packages missing in current channels:

  - fraggenescan -> perl 5.22.0*

We have searched for the packages in the following channels:

  - https://conda.anaconda.org/bioconda/linux-64
  - https://conda.anaconda.org/bioconda/noarch
  - https://repo.continuum.io/pkgs/main/linux-64
  - https://repo.continuum.io/pkgs/main/noarch
  - https://repo.continuum.io/pkgs/free/linux-64
  - https://repo.continuum.io/pkgs/free/noarch
  - https://repo.continuum.io/pkgs/r/linux-64
  - https://repo.continuum.io/pkgs/r/noarch
  - https://repo.continuum.io/pkgs/pro/linux-64
  - https://repo.continuum.io/pkgs/pro/noarch

syang@glic2:~> conda search -c bioconda maxbin2
Fetching package metadata .............
maxbin2                      2.2.1                         0  bioconda        
                             2.2.1                  r3.3.1_1  bioconda        
                             2.2.1                  r3.3.2_1  bioconda        
                             2.2.1                  r3.4.1_1  bioconda        
                             2.2.4                  r3.4.1_0  bioconda        
syang@glic2:~> conda install -c bioconda maxbin2
Fetching package metadata .............
Solving package specifications: 

PackageNotFoundError: Packages missing in current channels:

  - maxbin2 -> fraggenescan >=1.30 -> perl 5.22.0*

We have searched for the packages in the following channels:

  - https://conda.anaconda.org/bioconda/linux-64
  - https://conda.anaconda.org/bioconda/noarch
  - https://repo.continuum.io/pkgs/main/linux-64
  - https://repo.continuum.io/pkgs/main/noarch
  - https://repo.continuum.io/pkgs/free/linux-64
  - https://repo.continuum.io/pkgs/free/noarch
  - https://repo.continuum.io/pkgs/r/linux-64
  - https://repo.continuum.io/pkgs/r/noarch
  - https://repo.continuum.io/pkgs/pro/linux-64
  - https://repo.continuum.io/pkgs/pro/noarch

I am not sure whether this is the reason for the conda HTTP error.

With conda create -n maxbin2_env --file /home/syang/anaconda3/lib/python3.6/site-packages/atlas/envs/optional_genome_binning.yaml I get:

CondaValueError: could not parse '- python=2.7' in: /home/syang/anaconda3/lib/python3.6/site-packages/atlas/envs/optional_genome_binning.yaml

SilasK commented 6 years ago

@camel315 I don't know why you can't parse the file. I assume you also need the conda-forge and maybe the r channel:

conda install -c bioconda -c conda-forge -c r maxbin2
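An alternative worth trying, sketched under the assumption that the file is a standard conda environment YAML (conda create --file expects a plain list of package specs, whereas YAML environment files are handled by conda env create):

conda env create -n maxbin2_env --file /home/syang/anaconda3/lib/python3.6/site-packages/atlas/envs/optional_genome_binning.yaml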

ghost commented 6 years ago

@SilasK I have skipped the cluster and gone back to our small workstation. Atlas worked without the CondaHTTP error, but the initialize_checkm error still exists. The checkM files, which were manually extracted from checkm_data_16012015_v0.9.7.tar.gz, were removed but not regenerated. Please see the message below:

$ cat logs/initialize_checkm.log 

*******************************************************************************
 [CheckM - data] Check for database updates. [setRoot]
*******************************************************************************

Data location successfully changed to: /home/syang/databases/checkm

*******************************************************************************
 [CheckM - data] Check for database updates. [update]
*******************************************************************************

The following error message was automatically printed to the screen:

stats.sh in=OD3/assembly/OD3_prefilter_contigs.fasta format=3 -Xmx16G > OD3/assembly/contig_stats/prefilter_contig_stats.txt                                             
Finished job 17.
3 of 176 steps (2%) done
Waiting at most 5 seconds for missing files.
Error in job initialize_checkm while creating output files /home/syang/databases/checkm/test_data/637000110.fna, /home/syang/databases/checkm/taxon_marker_sets.tsv, /home/syang/databases/checkm/selected_marker_sets.tsv, /home/syang/databases/checkm/pfam/tigrfam2pfam.tsv, /home/syang/databases/checkm/pfam/Pfam-A.hmm.dat, /home/syang/databases/checkm/img/img_metadata.tsv, /home/syang/databases/checkm/hmms_ssu/SSU_euk.hmm, /home/syang/databases/checkm/hmms_ssu/SSU_bacteria.hmm, /home/syang/databases/checkm/hmms_ssu/SSU_archaea.hmm, /home/syang/databases/checkm/hmms_ssu/createHMMs.py, /home/syang/databases/checkm/hmms/phylo.hmm.ssi, /home/syang/databases/checkm/hmms/phylo.hmm, /home/syang/databases/checkm/hmms/checkm.hmm.ssi, /home/syang/databases/checkm/hmms/checkm.hmm, /home/syang/databases/checkm/genome_tree/missing_duplicate_genes_97.tsv, /home/syang/databases/checkm/genome_tree/missing_duplicate_genes_50.tsv, /home/syang/databases/checkm/genome_tree/genome_tree.taxonomy.tsv, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/phylo_modelJqWx6_.json, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.tre, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.log, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.fasta, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/CONTENTS.json, /home/syang/databases/checkm/genome_tree/genome_tree.metadata.tsv, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/phylo_modelEcOyPk.json, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.tre, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.log, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.fasta, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/CONTENTS.json, /home/syang/databases/checkm/genome_tree/genome_tree.derep.txt, /home/syang/databases/checkm/.dmanifest, /home/syang/databases/checkm/distributions/td_dist.txt, /home/syang/databases/checkm/distributions/gc_dist.txt, /home/syang/databases/checkm/distributions/cd_dist.txt, logs/checkm_init.txt.         
MissingOutputException in line 521 of /home/syang/miniconda3/lib/python3.5/site-packages/atlas/rules/assemble.snakefile:                                                  
Missing files after 5 seconds:                                                       
/home/syang/databases/checkm/test_data/637000110.fna                                 
/home/syang/databases/checkm/taxon_marker_sets.tsv                                   
/home/syang/databases/checkm/selected_marker_sets.tsv                                
/home/syang/databases/checkm/pfam/tigrfam2pfam.tsv                                   
/home/syang/databases/checkm/pfam/Pfam-A.hmm.dat                                     
/home/syang/databases/checkm/img/img_metadata.tsv                                    
/home/syang/databases/checkm/hmms_ssu/SSU_euk.hmm                                    
/home/syang/databases/checkm/hmms_ssu/SSU_bacteria.hmm                               
/home/syang/databases/checkm/hmms_ssu/SSU_archaea.hmm                                
/home/syang/databases/checkm/hmms_ssu/createHMMs.py                                  
/home/syang/databases/checkm/hmms/phylo.hmm.ssi                                      
/home/syang/databases/checkm/hmms/phylo.hmm                                          
/home/syang/databases/checkm/hmms/checkm.hmm.ssi                                     
/home/syang/databases/checkm/hmms/checkm.hmm                                         
/home/syang/databases/checkm/genome_tree/missing_duplicate_genes_97.tsv              
/home/syang/databases/checkm/genome_tree/missing_duplicate_genes_50.tsv              
/home/syang/databases/checkm/genome_tree/genome_tree.taxonomy.tsv                    
/home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/phylo_modelJqWx6_.json                                                                                
/home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.tre  
/home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.log  
/home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.fasta
/home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/CONTENTS.json    
/home/syang/databases/checkm/genome_tree/genome_tree.metadata.tsv                    
/home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/phylo_modelEcOyPk.json                                                                                   
/home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.tre
/home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.log
/home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.fasta
/home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/CONTENTS.json
/home/syang/databases/checkm/genome_tree/genome_tree.derep.txt
/home/syang/databases/checkm/distributions/td_dist.txt
/home/syang/databases/checkm/distributions/gc_dist.txt
/home/syang/databases/checkm/distributions/cd_dist.txt
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Removing output files of failed job initialize_checkm since they might be corrupted:
/home/syang/databases/checkm/.dmanifest, logs/checkm_init.txt
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
brwnj commented 6 years ago

Implementing CheckM wasn't as easy as it should be, so I apologize for your continued headaches and appreciate your patience.

My recommendation would be to delete the checkm database directory -- in your latest instance that's /home/syang/databases/checkm. Also delete logs/checkm_init.txt. Re-running atlas assemble will then re-run rule initialize_checkm and attempt to re-download the checkm reference databases.
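As a sketch, those reset steps as shell commands (paths taken from the log above; run from the atlas working directory):

rm -rf /home/syang/databases/checkm    # delete the partial checkm database directory
rm -f logs/checkm_init.txt             # delete the sentinel file written by rule initialize_checkm
atlas assemble --jobs 24 --out-dir results config.yaml   # re-runs rule initialize_checkm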

If you don't have internet access on compute nodes, run the following directly on the head node (or any node with outside connectivity) so the conda environments are created first:

atlas assemble --jobs 24 --out-dir results config.yaml --create-envs-only
ghost commented 6 years ago

@brwnj Following your suggestion, I ran this on the cluster with

bsub -n 24 -q qintel -e err.txt -o out.txt atlas assemble --jobs 24 --out-dir results /home/syang/config.yaml --create-envs-only True --latency-wait 20

It reported:

Executing: snakemake --snakefile /home/syang/anaconda3/lib/python3.5/site-packages/atlas/Snakefile --directory /home/syang/results --printshellcmds --jobs 24 --rerun-incomplete --configfile '/home/syang/config.yaml' --nolock --use-conda --config workflow=complete --create-envs-only --latency-wait 20

usage: snakemake [-h] [--snakefile FILE] [--gui [PORT]] [--cores [N]] [--local-cores N] [--resources [NAME=INT [NAME=INT ...]]] [--config [KEY=VALUE [KEY=VALUE ...]]] [--configfile FILE] [--list] [--list-target-rules] [--directory DIR] [--dryrun] [--printshellcmds] [--debug-dag] [--dag] [--force-use-threads] [--rulegraph] [--d3dag] [--summary] [--detailed-summary] [--archive FILE] [--touch] [--keep-going] [--force] [--forceall] [--forcerun [TARGET [TARGET ...]]] [--prioritize TARGET [TARGET ...]] [--until TARGET [TARGET ...]] [--omit-from TARGET [TARGET ...]] [--allow-ambiguity] [--cluster CMD | --cluster-sync CMD | --drmaa [ARGS]] [--drmaa-log-dir DIR] [--cluster-config FILE] [--immediate-submit] [--jobscript SCRIPT] [--jobname NAME] [--reason] [--stats FILE] [--nocolor] [--quiet] [--nolock] [--unlock] [--cleanup-metadata FILE [FILE ...]] [--rerun-incomplete] [--ignore-incomplete] [--list-version-changes] [--list-code-changes] [--list-input-changes] [--list-params-changes] [--latency-wait SECONDS] [--wait-for-files [FILE [FILE ...]]] [--benchmark-repeats N] [--notemp] [--keep-remote] [--keep-target-files] [--keep-shadow] [--allowed-rules ALLOWED_RULES [ALLOWED_RULES ...]] [--max-jobs-per-second MAX_JOBS_PER_SECOND] [--restart-times RESTART_TIMES] [--timestamp] [--greediness GREEDINESS] [--no-hooks] [--print-compilation] [--overwrite-shellcmd OVERWRITE_SHELLCMD] [--verbose] [--debug] [--profile FILE] [--mode {0,1,2}] [--bash-completion] [--use-conda] [--conda-prefix DIR] [--wrapper-prefix WRAPPER_PREFIX] [--default-remote-provider {S3,GS,SFTP,S3Mocked}] [--default-remote-prefix DEFAULT_REMOTE_PREFIX] [--version] [target [target ...]]

snakemake: error: unrecognized arguments: --create-envs-only

I checked 'create-envs-only', which is a parameter of the snakemake API. How can I use it correctly here?

brwnj commented 6 years ago

You likely just need to update snakemake. The latest available version on bioconda today is 4.3.1.

I thought you determined earlier that the compute nodes do NOT have internet access. The command line above should be run directly on the head node (or whichever node has outside internet connectivity).

ghost commented 6 years ago

@brwnj The run on our own workstation failed because checkm still wants to initialize. Thus the same error popped up, 'Error in job initialize_checkm while creating output files':

Cannot find ID3/assembly/opts.txt
Please check whether the output directory is correctly set by "-o"
Now switching to normal mode.
MEGAHIT v1.1.2
--- [Thu Dec 14 23:31:08 2017] Start assembly. Number of CPU threads 8 ---
--- [Thu Dec 14 23:31:08 2017] Available memory: 25065529344, used: 50000000000
--- [Thu Dec 14 23:31:08 2017] Converting reads to binaries ---
b' [read_lib_functions-inl.h : 209] Lib 0 (ID3/assembly/reads/normalized.errorcorr.merged_R1.fastq.gz): se, 3209514 reads, 151 max length'
b' [utils.h : 126] Real: 8.4324\tuser: 2.9971\tsys: 0.3602\tmaxrss: 162404'
....
--- [Thu Dec 14 23:44:43 2017] Building graph for k = 121 ---
--- [Thu Dec 14 23:44:50 2017] Assembling contigs from SdBG for k = 121 ---
--- [Thu Dec 14 23:45:38 2017] Merging to output final contigs ---
--- [STAT] 48932 contigs, total 45512920 bp, min 500 bp, max 16287 bp, avg 930 bp, N50 944 bp
--- [Thu Dec 14 23:45:38 2017] ALL DONE. Time elapsed: 870.849289 seconds ---
Removing temporary output file ID3/assembly/reads/normalized.errorcorr.merged_R1.fastq.gz.
Removing temporary output file ID3/assembly/reads/normalized.errorcorr.merged_R2.fastq.gz.
Removing temporary output file ID3/assembly/reads/normalized.errorcorr.merged_se.fastq.gz.
Finished job 181. 4 of 173 steps (2%) done
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

2. For the cluster, all jobs are submitted to the queue system and then assigned to compute nodes; I am not sure whether I understood your solution correctly. After updating snakemake to 4.3.1, it still showed 'Creating conda environment /home/syang/anaconda3/lib/python3.5/site-packages/atlas/envs/required_packages.yaml...... CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/bioconda/linux-64/repodata.json...... Command 'snakemake --snakefile ... --nolock --use-conda --config workflow=complete --create-envs-only --latency-wait 20' returned non-zero exit status 1'

brwnj commented 6 years ago

CheckM can't (easily) be run without conda, but an internet connection is only required once. Your compute nodes appear not to have internet access, so running as you are, you will keep hitting the same error telling you that you don't have internet.

My suggestion was to run the command on a login node or head node, basically one of them that is capable of an outside connection. So, rather than:

bsub -n 24 -q qintel -e err.txt -o out.txt atlas assemble --jobs 24 --out-dir results /home/syang/config.yaml --create-envs-only True --latency-wait 20

It would be simply:

atlas assemble --jobs 24 --out-dir results /home/syang/config.yaml --create-envs-only --latency-wait 20
ghost commented 6 years ago

@brwnj Thank you for your solution. The pipeline starts working now.

ghost commented 6 years ago

@brwnj Sorry to disturb you again with the initialize_checkm error:

Finished job 85. 39 of 241 steps (16%) done

localrule postprocess_after_decontamination: input: OD3/sequence_quality_control/OD3_clean_R1.fastq.gz, OD3/sequence_quality_control/OD3_clean_R2.fastq.gz, OD3/sequence_quality_control/OD3_clean_se.fastq.gz output: OD3/sequence_quality_control/OD3_QC_R1.fastq.gz, OD3/sequence_quality_control/OD3_QC_R2.fastq.gz, OD3/sequence_quality_control/OD3_QC_se.fastq.gz jobid: 88 wildcards: sample=OD3

localrule initialize_checkm: output: /home/syang/databases/checkm/test_data/637000110.fna, /home/syang/databases/checkm/taxon_marker_sets.tsv, /home/syang/databases/checkm/selected_marker_sets.tsv, /home/syang/databases/checkm/pfam/tigrfam2pfam.tsv, /home/syang/databases/checkm/pfam/Pfam-A.hmm.dat, /home/syang/databases/checkm/img/img_metadata.tsv, /home/syang/databases/checkm/hmms_ssu/SSU_euk.hmm, /home/syang/databases/checkm/hmms_ssu/SSU_bacteria.hmm, /home/syang/databases/checkm/hmms_ssu/SSU_archaea.hmm, /home/syang/databases/checkm/hmms_ssu/createHMMs.py, /home/syang/databases/checkm/hmms/phylo.hmm.ssi, /home/syang/databases/checkm/hmms/phylo.hmm, /home/syang/databases/checkm/hmms/checkm.hmm.ssi, /home/syang/databases/checkm/hmms/checkm.hmm, /home/syang/databases/checkm/genome_tree/missing_duplicate_genes_97.tsv, /home/syang/databases/checkm/genome_tree/missing_duplicate_genes_50.tsv, /home/syang/databases/checkm/genome_tree/genome_tree.taxonomy.tsv, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/phylomodelJqWx6.json, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.tre, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.log, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/genome_tree.fasta, /home/syang/databases/checkm/genome_tree/genome_tree_reduced.refpkg/CONTENTS.json, /home/syang/databases/checkm/genome_tree/genome_tree.metadata.tsv, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/phylo_modelEcOyPk.json, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.tre, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.log, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/genome_tree.fasta, /home/syang/databases/checkm/genome_tree/genome_tree_full.refpkg/CONTENTS.json, /home/syang/databases/checkm/genome_tree/genome_tree.derep.txt, /home/syang/databases/checkm/.dmanifest, /home/syang/databases/checkm/distributions/td_dist.txt, /home/syang/databases/checkm/distributions/gc_dist.txt, /home/syang/databases/checkm/distributions/cd_dist.txt, logs/checkm_init.txt log: logs/initialize_checkm.log jobid: 130

Conda environment defines Python version < 3.5. Using Python of the master process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.5 only.
/home/syang/anaconda3/bin/python /home/syang/anaconda3/lib/python3.5/site-packages/atlas/rules/.snakemake.1dkqwxrq.initialize_checkm.py
Activating conda environment /home/syang/results/.snakemake/conda/11f0e3ea.
Removing temporary output file OD3/sequence_quality_control/OD3_clean_R1.fastq.gz.
Removing temporary output file OD3/sequence_quality_control/OD3_clean_R2.fastq.gz.
Removing temporary output file OD3/sequence_quality_control/OD3_clean_se.fastq.gz.
Finished job 88. 40 of 241 steps (17%) done
Waiting at most 20 seconds for missing files.
Removing output files of failed job initialize_checkm since they might be corrupted: /home/syang/databases/checkm/.dmanifest, logs/checkm_init.txt
Will exit after finishing currently running jobs.

results/logs/initialize_checkm.log:

*******************************************************************************
 [CheckM - data] Check for database updates. [setRoot]
*******************************************************************************

Data location successfully changed to: /home/syang/databases/checkm

*******************************************************************************
 [CheckM - data] Check for database updates. [update]
*******************************************************************************

Exiting because a job execution failed. Look above for error message
Complete log: /home/syang/.snakemake/log/2017-12-15T111045.335026.snakemake.log
[2017-12-15 14:06 CRITICAL] Command 'snakemake --snakefile /home/syang/anaconda3/lib/python3.5/site-packages/atlas/Snakefile --directory /home/syang/results --printshellcmds --jobs 24 --rerun-incomplete --configfile '/home/syang/config.yaml' --nolock --use-conda --config workflow=complete --latency-wait 20' returned non-zero exit status 1

brwnj commented 6 years ago

I anticipated this occurring, as this step also needs an internet connection. Now that the data are processed to this point, we need to run another command on the head node, and the rule target cannot have wildcards. It'll be something like what Silas referred to earlier:

atlas assemble --jobs 24 --out-dir results /home/syang/config.yaml --latency-wait 20 logs/checkm_init.txt

The file specification on the end is telling Snakemake that we only want to build this file, so it should have minimal impact on the head node.

ghost commented 6 years ago

@brwnj I followed your suggestion, and the localrule initialize_checkm still failed. This error also occurred on our group workstation, which definitely has a good internet connection. Thus, there must be some other reason or bug.

brwnj commented 6 years ago

I've been going through the code this week and hope to get around to taking another look at the bin validation step soon. In the meantime, you could always set perform_genome_binning: false in your configuration. All other steps besides binning and checkm will run.
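For reference, a minimal sketch of that change in config.yaml (option name as given above; the rest of the file stays unchanged):

# in config.yaml
perform_genome_binning: false   # skip binning and checkm; all other steps still run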

ghost commented 6 years ago

@brwnj Thank you for your fast reply. I am running the rest of the pipeline according to your suggestion. By the way, is it possible to include the analysis of viruses, fungi, and protozoa in this magic pipeline? Would that modification be very complicated and require more packages? Thank you.

brwnj commented 6 years ago

It does take a fair amount of effort to add things like that. I've considered adding a kmer-based annotation protocol, maybe as an alternate annotation. If the method is reasonably fast, I would implement it to run as a default protocol. Adding something like https://github.com/bioinformatics-centre/kaiju is what I'm thinking. The user would then choose the alternate annotation database from their offerings:

Virus references can be added to any of the above 3.

We're trying to wrap up the paper soon, so this will likely be added to a development branch when it's started.

ghost commented 6 years ago

@brwnj Sorry to disturb you again; still a question about checkm. Due to security reasons, for instance, the computing cluster nodes are on a private network and have no connection to outside servers like conda.anaconda.org. In this case, is the rule initialize_checkm mandatory for processing each sample in the remaining jobs? If not, is it possible to modify the script so it only runs once on the head node? In addition, I checked the GitHub page of checkm: it requires Python < 3.0. Does this also make it hard to incorporate into the pipeline?

In addition, I met an error related to qc.snakefile. I am not sure whether it is also related to checkm and 'perform_genome_binning: false':

Error in rule calculate_insert_size:
    jobid: 91
    output: OD3/sequence_quality_control/read_stats/QC_insert_size_hist.txt, OD3/sequence_quality_control/read_stats/QC_read_length_hist.txt
    log: OD3/logs/OD3_calculate_insert_size.log

RuleException: CalledProcessError in line 407 of /home/syang/anaconda3/lib/python3.5/site-packages/atlas/rules/qc.snakefile:
Command 'source activate /home/syang/results/.snakemake/conda/dbc7d302; set -euo pipefail; bbmerge.sh -Xmx32G threads=24 in1=OD3/sequence_quality_control/OD3_QC_R1.fastq.gz in2=OD3/sequence_quality_control/OD3_QC_R2.fastq.gz loose ecct k=62 extend2=50 ihist=OD3/sequence_quality_control/read_stats/QC_insert_size_hist.txt merge=f mininsert0=35 minoverlap0=8 2> >(tee OD3/logs/OD3_calculate_insert_size.log)

            readlength.sh in=OD3/sequence_quality_control/OD3_QC_R1.fastq.gz in2=OD3/sequence_quality_control/OD3_QC_R2.fastq.gz out=OD3/sequence_quality_control/read_stats/QC_read_length_hist.txt 2> >(tee OD3/logs/OD3_calculate_insert_size.log) ' returned non-zero exit status 1

  File "/home/syang/anaconda3/lib/python3.5/site-packages/atlas/rules/qc.snakefile", line 407, in __rule_calculate_insert_size
  File "/home/syang/anaconda3/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

SilasK commented 6 years ago

You can run only the step needing internet on your head node with atlas assemble -R initialize_checkm.

Can you send us OD3/logs/OD3_calculate_insert_size.log?

ghost commented 6 years ago

@SilasK To avoid the checkm initialization, I commented out the rule initialize_checkm block in assemble.snakefile. So far, OD3/logs/OD3_calculate_insert_size.log looks like this:

java -Djava.library.path=/home/syang/results/.snakemake/conda/dbc7d302/opt/bbmap-37.17/jni/ -ea -Xmx32G -Xms32G -cp /home/syang/results/.snakemake/conda/dbc7d302/opt/bbmap-37.17/current/ jgi.BBMerge -Xmx32G threads=24 in1=OD3/sequence_quality_control/OD3_QC_R1.fastq.gz in2=OD3/sequence_quality_control/OD3_QC_R2.fastq.gz loose ecct k=62 extend2=50 ihist=OD3/sequence_quality_control/read_stats/QC_insert_size_hist.txt merge=f mininsert0=35 minoverlap0=8 Executing jgi.BBMerge [-Xmx32G, threads=24, in1=OD3/sequence_quality_control/OD3_QC_R1.fastq.gz, in2=OD3/sequence_quality_control/OD3_QC_R2.fastq.gz, loose, ecct, k=62, extend2=50, ihist=OD3/sequence_quality_control/read_stats/QC_insert_size_hist.txt, merge=f, mininsert0=35, minoverlap0=8]

BBMerge version 37.17 Revised arguments: [minoverlap=8, minoverlap0=9, qualiters=4, mismatches=3, margin=2, ratiooffset=0.4, minsecondratio=0.08, maxratio=0.11, ratiomargin=4.7, ratiominoverlapreduction=2, pfilter=0.00002, efilter=8, minentropy=30, minapproxoverlap=30, -Xmx32G, threads=24, in1=OD3/sequence_quality_control/OD3_QC_R1.fastq.gz, in2=OD3/sequence_quality_control/OD3_QC_R2.fastq.gz, ecct, k=62, extend2=50, ihist=OD3/sequence_quality_control/read_stats/QC_insert_size_hist.txt, merge=f, mininsert0=35, minoverlap0=8]

Set threads to 24 Executing assemble.Tadpole2 [in=OD3/sequence_quality_control/OD3_QC_R1.fastq.gz, in2=OD3/sequence_quality_control/OD3_QC_R2.fastq.gz, branchlower=3, branchmult1=20.0, branchmult2=3.0, mincountseed=3, mincountextend=2, minprob=0.5, k=62, prealloc=false, prefilter=0, ecctail=false, eccpincer=false, eccreassemble=true]

Tadpole version 37.17 Using 24 threads. Executing ukmer.KmerTableSetU [in=OD3/sequence_quality_control/OD3_QC_R1.fastq.gz, in2=OD3/sequence_quality_control/OD3_QC_R2.fastq.gz, branchlower=3, branchmult1=20.0, branchmult2=3.0, mincountseed=3, mincountextend=2, minprob=0.5, k=62, prealloc=false, prefilter=0, ecctail=false, eccpincer=false, eccreassemble=true]

Initial: Ways=61, initialSize=128000, prefilter=f, prealloc=f Memory: max=32928m, free=32069m, used=859m

Estimated kmer capacity: 613835151 After table allocation: Memory: max=32928m, free=31725m, used=1203m

java.lang.OutOfMemoryError: Java heap space
    at ukmer.AbstractKmerTableU.allocLong2D(AbstractKmerTableU.java:218)
    at ukmer.HashArrayU1D.resize(HashArrayU1D.java:186)
    at ukmer.HashArrayU1D.incrementAndReturnNumCreated(HashArrayU1D.java:90)
    at ukmer.HashBufferU.dumpBuffer_inner(HashBufferU.java:196)
    at ukmer.HashBufferU.dumpBuffer(HashBufferU.java:168)
    at ukmer.HashBufferU.incrementAndReturnNumCreated(HashBufferU.java:57)
    at ukmer.KmerTableSetU$LoadThread.addKmersToTable(KmerTableSetU.java:553)
    at ukmer.KmerTableSetU$LoadThread.run(KmerTableSetU.java:479)

This program ran out of memory. Try increasing the -Xmx flag and setting prealloc.

SilasK commented 6 years ago

The program didn't have enough memory. What was the command you used to run it on the cluster?

ghost commented 6 years ago

@SilasK In the config.yaml file, I set threads to 36 and java_mem to 48, and ran: bsub -n 36 -q par120 -e err.txt -o out.txt atlas assemble --jobs 36 --out-dir results /home/syang/config.yaml --latency-wait 240. In addition, have you finished the part of the pipeline that merges results (e.g. taxonomic or functional profiles) of multiple samples into one table resembling a normal OTU table? Is that what qc.snakefile does? Thank you.

ghost commented 6 years ago

@brwnj or @SilasK

Error in job convert_sam_to_bam while creating output file OD3/sequence_alignment/OD3.bam.
RuleException: AttributeError in line 621 of /home/syang/anaconda3/lib/python3.5/site-packages/atlas/rules/assemble.snakefile:
'Wildcards' object has no attribute 'sample'
  File "/home/syang/anaconda3/lib/python3.5/site-packages/atlas/rules/assemble.snakefile", line 621, in __rule_convert_sam_to_bam
  File "/home/syang/anaconda3/lib/python3.5/string.py", line 191, in format
  File "/home/syang/anaconda3/lib/python3.5/string.py", line 195, in vformat
  File "/home/syang/anaconda3/lib/python3.5/string.py", line 235, in _vformat
  File "/home/syang/anaconda3/lib/python3.5/string.py", line 306, in get_field
  File "/home/syang/anaconda3/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

brwnj commented 6 years ago

That fix hasn't been pulled into master yet, but was implemented in f49bd12013e9663784ae0b6264f08420ec030a2a.

SilasK commented 6 years ago

@camel315 For the memory problem on the cluster: request memory from the cluster that is 15% higher than java_mem. Could you then report how this works with the bsub command?
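For example, a hypothetical bsub invocation (the rusage syntax and memory units depend on your LSF configuration; with java_mem set to 48 GB, roughly 55 GB gives a 15% margin):

bsub -n 36 -q par120 -R "rusage[mem=55000]" -e err.txt -o out.txt atlas assemble --jobs 36 --out-dir results /home/syang/config.yaml --latency-wait 240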

ghost commented 6 years ago

Increasing memory by 15% appears to be insufficient on the workstation; the cluster job is still running. In addition, qc.snakefile raised another error:

  File "/home/syang/anaconda3/lib/python3.5/site-packages/atlas/rules/qc.snakefile", line 407, in __rule_calculate_insert_size
  File "/home/syang/anaconda3/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

I checked tadpole.sh; it has several parameters to avoid running out of memory. After adding prealloc=t, prefilter=t, and minprob=0.8 to the corresponding lines in qc.snakefile, the pipeline worked again.
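For reference, a sketch of the amended bbmerge.sh call (the original command from the failing rule above, with the three memory-saving flags appended; the exact placement in qc.snakefile is an assumption):

bbmerge.sh -Xmx32G threads=24 in1=OD3/sequence_quality_control/OD3_QC_R1.fastq.gz in2=OD3/sequence_quality_control/OD3_QC_R2.fastq.gz loose ecct k=62 extend2=50 ihist=OD3/sequence_quality_control/read_stats/QC_insert_size_hist.txt merge=f mininsert0=35 minoverlap0=8 prealloc=t prefilter=t minprob=0.8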

jmtsuji commented 6 years ago

Automated database updates to be phased out in CheckM

Going back to the original issue in this thread: I've been having the same problem with initialize_checkm.py. The "failed to connect to server" error appears to be a known issue with CheckM's automated database update feature (checkm data update) rather than an issue with ATLAS. I just posted a GitHub issue on this and got a response: https://github.com/Ecogenomics/CheckM/issues/132. It's recommended to do a manual database download instead of an automated update (in fact, the automated database update functionality has been removed in newer versions of CheckM).

Temporary workaround

As a temporary workaround, I've commented out the checkm data update command in initialize_checkm.py, as well as the relevant output files specified in assemble.snakefile (rule initialize_checkm), to skip the database check, and I have downloaded the most recent CheckM database manually (wget https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_v1.0.9.tar.gz). This seems to work.
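As a sketch, the manual setup described above (the database path is taken from earlier in this thread; checkm data setRoot then points CheckM at the extracted files):

mkdir -p /home/syang/databases/checkm
cd /home/syang/databases/checkm
wget https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_v1.0.9.tar.gz
tar -xzf checkm_data_v1.0.9.tar.gz
checkm data setRoot /home/syang/databases/checkm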

Solution for ATLAS

Could the CheckM databases be downloaded as part of atlas download instead of relying on checkm data update?

brwnj commented 6 years ago

I'm looking into it. Any chance someone has time to update the checkm bioconda recipe?

I'm heavily in favor of distributing the downloads via Zenodo because connections to their servers are often very slow.

jmtsuji commented 6 years ago

I could make a pull request for disabling checkm data update in the ATLAS pipeline, if helpful. Updating the CheckM bioconda recipe would not be strictly needed in this case.

Also, I think that adding the CheckM database to Zenodo makes sense. The last database update was in 2015, so the database appears fairly stable.

Most recent version is: https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_v1.0.9.tar.gz

brwnj commented 6 years ago

I'm testing the changes to the download method now that were implemented in b7773532413f89b575a45eb4e3ba063ef4c49fd0. Afterwards, I'll test the assembly protocol on a clean environment.

brwnj commented 6 years ago

In c8952cd00253c2c8176b7d2e3bf39bf93c639b72, the checkm reference database is now downloaded via atlas download and the subsequent rule initialize_checkm writes logs/checkm_init.txt. If a user has issues with this rule, they can delete logs/checkm_init.txt to re-run the rule. If the user already has everything set up, they will still have to make sure the database files are pre-downloaded before running assembly.