Hi Silas, thanks for the extra explanation! I made a new conda env for 2.3b, and the cluster config file is found now. I am trying some settings. The genecatalog step in particular was slow with the CAT database. Can I just rerun this last part, or which modifications are there in the atlas 2.3 version to predict gene functions either on the assembly or on the MAGs? For now I did it on the MAGs, but I have some shallow metagenomes for which the MAGs are rather incomplete, and I would prefer to run the genecatalog on the assembly. With this cluster setup, could I maybe also try InterProScan annotation?
Hey Sofie,
All changes in Atlas 2.3 should be after the genome dereplication step, so running atlas 2.3 in the same working directory as before should not rerun assembly for example.
Important: double-check your config file against the config file for atlas 2.3. There are some small changes, especially in the execution parameters and the annotations.
But to be sure, you can always do a dry-run first.
I suggest maybe renaming the genome and Genecatalog subdirectories.
Changes in 2.2 and 2.3 include dropping CAT in favour of GTDB-Tk, and annotating genes with eggNOG-mapper v2.
If you set:
genecatalog:
  source: contigs
  clustermethod: linclust
  minlength_nt: 100
  minid: 0.95
  coverage: 0.9
  extra: ""
  SubsetSize: 500000
then the gene annotation is done on the contigs. All genes are clustered into a gene catalog, and the representatives are annotated with eggNOG-mapper.
Yes, you can then run InterProScan on the gene catalog.
There is also a command-line executable for KOfamScan.
Hi Silas, thanks a lot for all your effort developing this very useful pipeline.
I am trying to generate a gene catalog from wastewater metagenomes, but I am having difficulties running metagenome-atlas on a PBS cluster. I followed the instructions to set up cluster mode and ran atlas init:
atlas init --db-dir /gpfs1/scratch/db/atlas --working-dir EZ --data-type metagenome --assembler megahit --threads=10 --skip-qc 01_QC
Then everything worked fine running atlas run in dry mode:
atlas run -w EZ -c /gpfs1/scratch/Experiment2/test_gene_catalog/EZ/config.yaml --profile cluster --jobs 4 genecatalog -n
but when I tried to run the pipeline I got the following error for every job:
submit command: qsub -N init_pre_assembly_processing -l nodes=1 ppn=4 mem=10gb walltime=3000 /gpfs1/scratch/EZ/Experiment2/test_gene_catalog/EZ/.snakemake/tmp.5_kxsb9d/snakejob.init_pre_assembly_processing.86.sh
Traceback (most recent call last):
File "/home/.config/snakemake/cluster/scheduler.py", line 66, in <module>
raise Exception("Job can't be submitted\n"+output.decode("utf-8")+error.decode("utf-8"))
Exception: Job can't be submitted
usage: qsub [-a date_time] [-A account_string] [-c interval]
[-C directive_prefix] [-e path] [-f ] [-h ] [-I [-X]] [-j oe|eo] [-J X-Y[:Z]]
[-k keep] [-l resource_list] [-m mail_options] [-M user_list]
[-N jobname] [-o path] [-p priority] [-P project] [-q queue] [-r y|n]
[-R o|e|oe] [-S path] [-u user_list] [-W otherattributes=value...]
[-S path] [-u user_list] [-W otherattributes=value...]
[-v variable_list] [-V ] [-z] [script | -- command [arg1 ...]]
qsub --version
Error submitting jobscript (exit code 1):
Thanks in advance for your help.
Well done, I think you did everything right.
Now, each cluster is a bit different. Making a tool that can submit jobs to all clusters without problems is probably impossible, but with some adjustments it should work.
Are you aware of any constraints of your cluster? Different queues, maximum memory, maximum walltime? What is the default queue of your cluster system? Do you have a special queue for high-memory jobs?
Unfortunately the log is not very helpful, but if you run the following command you may get more information on why the submit command failed:
qsub -N init_pre_assembly_processing -l nodes=1 ppn=4 mem=10gb walltime=3000 /gpfs1/scratch/EZ/Experiment2/test_gene_catalog/EZ/.snakemake/tmp.5_kxsb9d/snakejob.init_pre_assembly_processing.86.sh
This command should submit a job with 4 threads and 10 GB of memory for 3000 s.
Thanks for your quick reply.
There are indeed different queues with some limitations, but the std queue seems unrestricted:
Compute (AMD) nodes (190)
normal priority
max 7000 cores
resource allocation parity is determined by group/project
Therefore, I set the ~/.config/snakemake/cluster/cluster_config.yaml specifying the std queue for all the steps.
__default__:
  # default parameter for all rules
  #queue: std
  nodes: 1
# The following rules in atlas need more time/memory.
# If you need to submit them to different queues you can configure this as outlined.
# run_megahit:
#   queue: std
# run_spades:
#   queue: std
# gtdb-tk classify uses 'large_mem' and long time
# classify:
#   queue: std
# run_checkm_lineage_wf:
#   queue: std
# run_all_checkm_lineage_wf:
#   queue: std
# You can overwrite values for specific rules
#account: "florentin"
#time: # h
#threads:
Unfortunately, I couldn't find the bash script under .snakemake/tmp*/snakejob.init_pre_assembly_processing.86.sh; I guess it is removed soon afterwards.
~/.config/snakemake/cluster/cluster_config.yaml is a YAML file; the # marks the beginning of a comment.
__default__:
  # default parameter for all rules
  queue: std
  nodes: 1
  account: "florentin"
This should be everything you need to specify. Probably the queue: std is not even necessary.
Tell me how it goes.
Thanks. I set the ~/.config/snakemake/cluster/cluster_config.yaml to
__default__:
  # default parameter for all rules
  queue: std
  nodes: 1
  account: "florentin"
I am facing a different error now.
Traceback (most recent call last):
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/__init__.py", line 627, in snakemake
batch=batch,
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/workflow.py", line 844, in execute
success = scheduler.schedule()
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/scheduler.py", line 364, in schedule
self.run(job)
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/scheduler.py", line 383, in run
error_callback=self._error,
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/executors.py", line 813, in run
jobscript = self.get_jobscript(job)
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/executors.py", line 629, in get_jobscript
f = job.format_wildcards(self.jobname, cluster=self.cluster_wildcards(job))
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/executors.py", line 701, in cluster_wildcards
return Wildcards(fromdict=self.cluster_params(job))
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/executors.py", line 676, in cluster_params
cluster = self.cluster_config.get("__default__", dict()).copy()
AttributeError: 'NoneType' object has no attribute 'copy'
[2020-01-13 18:55 CRITICAL] Command 'snakemake --snakefile /home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas --jobs 4 --rerun-incomplete --configfile '/gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/config.yaml' --nolock --profile cluster --use-conda --conda-prefix /gpfs1/scratch/florentin/db/atlas/conda_envs genecatalog ' returned non-zero exit status 1.
Can you try to remove the comment line? And if it still fails, send me the ~/.config/snakemake/cluster/cluster_config.yaml as a file in a comment.
Same error again:
atlas run -w EZ_atlas -c /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/config.yaml --profile cluster --jobs 1 genecatalog
[2020-01-13 20:57 INFO] Executing: snakemake --snakefile /home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas --jobs 1 --rerun-incomplete --configfile '/gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/config.yaml' --nolock --profile cluster --use-conda --conda-prefix /gpfs1/scratch/florentin/db/atlas/conda_envs genecatalog
Didn't find raw reads in sampleTable - skip QC
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cluster nodes: 1
Job counts:
count jobs
1 add_eggNOG_header
7 align_reads_to_prefilter_contigs
1 cluster_genes
1 combine_egg_nogg_annotations
1 concat_genes
7 error_correction
7 filter_by_coverage
1 filter_genes
7 finalize_contigs
1 gene_subsets
1 genecatalog
1 get_rep_proteins
7 init_pre_assembly_processing
7 merge_pairs
7 pileup_prefilter
7 predict_genes
7 rename_contigs
1 rename_gene_catalog
7 rename_megahit_output
1 rename_protein_catalog
7 run_megahit
87
[Mon Jan 13 20:57:15 2020]
rule init_pre_assembly_processing:
input: /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/01_QC/ESMetFM09_R1_01M.fastq.gz, /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/01_QC/ESMetFM09_R2_01M.fastq.gz
output: ESMetFM09-01M/assembly/reads/QC_R1.fastq.gz, ESMetFM09-01M/assembly/reads/QC_R2.fastq.gz
log: ESMetFM09-01M/logs/assembly/init.log
jobid: 82
wildcards: sample=ESMetFM09-01M
threads: 4
resources: mem=10, java_mem=8, time=0.5
Traceback (most recent call last):
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/__init__.py", line 627, in snakemake
batch=batch,
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/workflow.py", line 844, in execute
success = scheduler.schedule()
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/scheduler.py", line 364, in schedule
self.run(job)
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/scheduler.py", line 383, in run
error_callback=self._error,
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/executors.py", line 813, in run
jobscript = self.get_jobscript(job)
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/executors.py", line 629, in get_jobscript
f = job.format_wildcards(self.jobname, cluster=self.cluster_wildcards(job))
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/executors.py", line 701, in cluster_wildcards
return Wildcards(fromdict=self.cluster_params(job))
File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/snakemake/executors.py", line 676, in cluster_params
cluster = self.cluster_config.get("__default__", dict()).copy()
AttributeError: 'NoneType' object has no attribute 'copy'
[2020-01-13 20:57 CRITICAL] Command 'snakemake --snakefile /home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas --jobs 1 --rerun-incomplete --configfile '/gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/config.yaml' --nolock --profile cluster --use-conda --conda-prefix /gpfs1/scratch/florentin/db/atlas/conda_envs genecatalog ' returned non-zero exit status 1.
Please find enclosed the yaml file - had to use .txt extension otherwise github was unhappy.
Thanks a ton for your help.
Sorry, could you send me the ~/.config/snakemake/cluster/cluster_config.yaml
my bad, here it is : cluster_config.yaml.txt
The two spaces at the beginning are important:
__default__:
  queue: std
  nodes: 1
  account: "florentin"
The mail clients don't show the code correctly, but you can see the correct version on Github.
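To illustrate why the two spaces matter, here is a minimal sketch with PyYAML, which is what reads the cluster config; the file contents below are just examples:
import yaml

# Without indentation the keys do not nest under __default__,
# so __default__ itself gets the value None:
broken = yaml.safe_load("__default__:\nqueue: std\nnodes: 1\n")
print(broken)                 # {'__default__': None, 'queue': 'std', 'nodes': 1}
# snakemake then effectively calls cluster_config.get("__default__", dict()).copy(),
# and None.copy() raises: AttributeError: 'NoneType' object has no attribute 'copy'

# With two leading spaces the values nest correctly:
fixed = yaml.safe_load("__default__:\n  queue: std\n  nodes: 1\n")
print(fixed["__default__"])   # {'queue': 'std', 'nodes': 1}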
Properly formatting the ~/.config/snakemake/cluster/cluster_config.yaml file did not solve the issue yet.
However, I noticed that the way CPU resources are requested on the PBS cluster we are using here is different from what atlas generates for PBS clusters.
qsub -N init_pre_assembly_processing -q std **-l nodes=1 ppn=4** /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/.snakemake/tmp.85m__d6v/snakejob.init_pre_assembly_processing.83.sh
This command gave me the following error, suggesting that I am not giving qsub the proper options:
usage: qsub [-a date_time] [-A account_string] [-c interval]
[-C directive_prefix] [-e path] [-f ] [-h ] [-I [-X]] [-j oe|eo] [-J X-Y[:Z]]
[-k keep] [-l resource_list] [-m mail_options] [-M user_list]
[-N jobname] [-o path] [-p priority] [-P project] [-q queue] [-r y|n]
[-R o|e|oe] [-S path] [-u user_list] [-W otherattributes=value...]
[-S path] [-u user_list] [-W otherattributes=value...]
[-v variable_list] [-V ] [-z] [script | -- command [arg1 ...]]
qsub --version
while qsub -N init_pre_assembly_processing -q std **-l select=1:ncpus=1** /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/.snakemake/tmp.85m__d6v/snakejob.init_pre_assembly_processing.83.sh seems to work!
10444.pbs01
ls
01_QC EZ_atlas init_pre_assembly_processing.e10443 init_pre_assembly_processing.o10443
cat *43
Didn't find raw reads in sampleTable - skip QC Building DAG of jobs... Using shell: /bin/bash Provided cores: 64 Rules claiming more threads will be scaled down. Job counts: count jobs 1 init_pre_assembly_processing 1
[Mon Jan 13 23:36:12 2020] rule init_pre_assembly_processing: input: /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/01_QC/ESMetFM37_R1_01M.fastq.gz, /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/01_QC/ESMetFM37_R2_01M.fastq.gz output: ESMetFM37-01M/assembly/reads/QC_R1.fastq.gz, ESMetFM37-01M/assembly/reads/QC_R2.fastq.gz log: ESMetFM37-01M/logs/assembly/init.log jobid: 0 wildcards: sample=ESMetFM37-01M threads: 4 resources: mem=10, java_mem=8, time=0.5
Activating conda environment: /gpfs1/scratch/florentin/db/atlas/conda_envs/70a94580 [Mon Jan 13 23:36:16 2020] Finished job 0. 1 of 1 steps (100%) done
Resource Usage on 2020-01-13 23:36:16.281939: JobId: 10443.pbs01 Project: _pbs_project_default Submission Host: ln-0001.scelse.sg Exit Status: 0 NCPUs Requested: 1 NCPUs Used: 1 Memory Requested: None Memory Used: 0kb Vmem Used: 0kb CPU Time Used: 00:00:13 Walltime requested: None Walltime Used: 00:00:06 Start Time: Mon Jan 13 23:36:09 2020 End Time: Mon Jan 13 23:36:16 2020 Execution Nodes Used: (ca-0003:ncpus=1)
Is there an easy way to modify atlas so that it generates qsub commands with -l select=x:ncpus=x instead of -l nodes=x ppn=x?
Hi Florentin,
I had the same problem. To resolve the first error, AttributeError: 'NoneType' object has no attribute 'copy', I removed spaces in the formatting of cluster_config.yaml: cluster_config.zip
And then Silas, I made some modifications in the key_mapping.yaml (key_mapping.zip): -l before mem, and -l before walltime. As you can see below, I get a space between nodes= and :ppn=. How can I change this in the key_mapping.yaml so that it doesn't need a space? I see Florentin has the same issue: -l nodes=1 ppn=4.
submit command: qsub -N initialize_qc -l nodes=2 :ppn=4 -l mem=10gb -l walltime=3000 /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.9yoj5h75/snakejob.initialize_qc.186.sh
pbs:
  command: "qsub"
  key_mapping:
    name: "-N {}"
    account: "-A {}"
    queue: #"-q {}"
    nodes: "-l nodes={}"
    threads: ":ppn={}"
    mem: "-l mem={}gb"
    time: "-l walltime={}00" #min= seconds x 100
I think if this issue nodes=:ppn= is solved, we should be able to submit the jobs.
Is there any easy way to modify so it generates qsub commands with -l select=x:ncpus=x instead of -l nodes=x ppn=x ?
Yes, as Sofie points out, the key_mapping.yaml is there for exactly this reason.
I incorporated @Sofie8's changes in the profile. Have a look here and copy-paste the updated part into your key_mapping.yaml.
I think if this issue nodes=:ppn= is solved, we should be able to submit the jobs.
I solved the problem by always forcing 1 node. I don't know how the tools in atlas would span multiple nodes anyway.
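To make the space issue concrete, here is a stripped-down sketch of how the submit command gets assembled (example dicts only, mirroring the scheduler.py loop quoted further down in this thread):
# Every mapped key is appended with a leading space, so splitting the node and
# thread flags over two keys yields "-l nodes=1 :ppn=4" with an unwanted space.
key_mapping = {"nodes": "-l nodes={}", "threads": ":ppn={}"}
cluster_param = {"nodes": 1, "threads": 4}

command = "qsub"
for key in key_mapping:
    if key in cluster_param:
        command += " "
        command += key_mapping[key].format(cluster_param[key])
print(command)  # qsub -l nodes=1 :ppn=4
Folding both into a single entry, threads: "-l nodes=1:ppn={}", is what the updated profile does to avoid the stray space.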
@Sofie8 I'm however not sure if you can supply multiple -l arguments to the qsub command. From the qsub usage message, I understand there should be only one. But test it out.
@Sofie8 and @fconstancias Could I ask you to give me your version of qsub or pbs?
qsub --version
pbs_version = 18.2.3.20181206140456
Could you make atlas submit jobs by performing these changes?
Unfortunately no, I now have the following error:
... [Tue Jan 14 18:29:57 2020] rule init_pre_assembly_processing: input: /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/01_QC/ESMetFM37_R1_01M.fastq.gz, /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/01_QC/ESMetFM37_R2_01M.fastq.gz output: ESMetFM37-01M/assembly/reads/QC_R1.fastq.gz, ESMetFM37-01M/assembly/reads/QC_R2.fastq.gz log: ESMetFM37-01M/logs/assembly/init.log jobid: 83 wildcards: sample=ESMetFM37-01M threads: 4 resources: mem=10, java_mem=8, time=0.5
Traceback (most recent call last):
  File "/home/florentin/.config/snakemake/cluster/scheduler.py", line 49, in <module>
    command= command_options[system]['command']
KeyError: '{{cookiecutter.cluster_system}}'
Error submitting jobscript (exit code 1):
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/.snakemake/log/2020-01-14T182955.862075.snakemake.log
[2020-01-14 18:30 CRITICAL] Command 'snakemake --snakefile /home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/site-packages/atlas/Snakefile --directory /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas --jobs 4 --rerun-incomplete --configfile '/gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/config.yaml' --nolock --profile cluster --use-conda --conda-prefix /gpfs1/scratch/florentin/db/atlas/conda_envs genecatalog ' returned non-zero exit status 1.
Hi @Sofie8, thanks for your input !
@SilasK, please see here the yaml files.
cat ~/.config/snakemake/cluster/cluster_config.yaml
__default__:
  queue: std
  nodes: 1
cat ~/.config/snakemake/cluster/key_mapping.yaml
# only parameters defined in key_mapping (see below) are passed to the command in the order specified.
system: "{{cookiecutter.cluster_system}}" #check if system is defined below
slurm:
  command: "sbatch --parsable"
  key_mapping:
    name: "--job-name={}"
    threads: "-n {}"
    mem: "--mem={}g"
    account: "--account={}"
    queue: "--partition={}"
    time: "--time={}"
    nodes: "-N {}"
pbs:
  command: "qsub"
  key_mapping:
    name: "-N {}"
    account: "-A {}"
    queue: "-l partition={}"
    threads: "-l nodes=1:ppn={}" # always use 1 node
    mem: "-l mem={}gb"
    time: "-l walltime={}00" #min= seconds x 100
lsf:
  command: "bsub"
  key_mapping:
    name: "-J {}"
    threads: "-n {}"
    mem: "-M {}000000"
    account: "-P {}"
    queue: "-q {}"
    time: "-W {}"
    nodes: "-C {}"
# for other cluster systems see: https://slurm.schedmd.com/rosetta.pdf
Replace the line system: "{{cookiecutter.cluster_system}}" in ~/.config/snakemake/cluster/key_mapping.yaml with system: "pbs".
Thanks for your suggestion. Unfortunately it did not solve the issue:
submit command: qsub -N init_pre_assembly_processing -l partition=std -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/.snakemake/tmp.uzzu21do/snakejob.init_pre_assembly_processing.84.sh
Traceback (most recent call last):
  File "/home/florentin/.config/snakemake/cluster/scheduler.py", line 66, in <module>
    raise Exception("Job can't be submitted\n"+output.decode("utf-8")+error.decode("utf-8"))
Exception: Job can't be submitted
qsub: Cannot set attribute, read only or insufficient permission Resource_List.partition
Error submitting jobscript (exit code 1):
OK. If I understand the error qsub: Cannot set attribute, read only or insufficient permission Resource_List.partition correctly, you should just not define any partition. Remove the line from the ~/.config/snakemake/cluster/cluster_config.yaml.
Well, I think there is a parameter in ~/.config/snakemake/cluster/key_mapping.yaml that is incompatible with my qsub version.
In my case, the queue is specified by the -q argument. Hence, I have replaced queue: "-l partition={}" with queue: "-q {}" in the ~/.config/snakemake/cluster/key_mapping.yaml.
Well, it didn't solve everything, but I think I am closer than ever:
...
submit command: qsub -N error_correction -q std -l nodes=1:ppn=10 -l mem=60gb -l walltime=30000 /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/.snakemake/tmp.ne1r33ru/snakejob.error_correction.79.sh
Traceback (most recent call last):
File "/home/florentin/.config/snakemake/cluster/scheduler.py", line 75, in <module>
jobid= int(res.strip().split()[-1])
ValueError: invalid literal for int() with base 10: '10487.pbs01'
Some error related to the jobid.
And when I manually submit one of the jobs created by atlas, it works:
qsub -N error_correction -q std -l nodes=1:ppn=10 -l mem=60gb -l walltime=30000 /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/.snakemake/tmp.ne1r33ru/snakejob.error_correction.79.sh
10488.pbs01
Thanks a ton for your help @SilasK
OK, it seems your job IDs are not plain numbers but something like 10488.pbs01.
If you have a running job, e.g. one you submitted outside of atlas, can you try:
qstat -f -x 10488.pbs01
and
qstat -f -x 10488
I also updated the cluster profile. You can remove the cluster folder and try to make a new one.
It's the cookiecutter ... instruction.
qsub --version: Version: 6.1.3
@SilasK I have made the changes in key_mapping.yaml; now the manual job runs, but not when I submit the PBS script.
[Tue Jan 14 22:14:20 2020] rule initialize_qc: input: /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/raw/X5A2_R1.fastq.gz, /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/raw/X5A2_R2.fastq.gz output: X5A2/sequence_quality_control/X5A2_raw_R1.fastq.gz, X5A2/sequence_quality_control/X5A2_raw_R2.fastq.gz log: X5A2/logs/QC/init.log jobid: 198 wildcards: sample=X5A2 priority: 80 threads: 4 resources: mem=10, java_mem=8, time=0.5
submit command: qsub -N initialize_qc -A lp_h_microbe -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.8olj0aea/snakejob.initialize_qc.198.sh
Traceback (most recent call last):
File "/ddn1/vol1/site_scratch/leuven/314/vsc31426/newatlas23beta/cluster/scheduler.py", line 66, in
usage: qsub [-a date_time] [-A account_string] [-b secs]
[-c [ none | { enabled | periodic | shutdown |
depth=
Error submitting jobscript (exit code 1):
now the manual job runs: what do you mean by this exactly?
Is there no error message explaining why the qsub script fails?
Assuming the jobscript /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.8olj0aea/snakejob.initialize_qc.198.sh still exists, can you submit the following variations and see if they work or throw an understandable error message?
qsub -N initialize_qc -A lp_h_microbe -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.8olj0aea/snakejob.initialize_qc.198.sh
qsub -N initialize_qc -A lp_h_microbe -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 -w e /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.8olj0aea/snakejob.initialize_qc.198.sh
Yes, with manual I meant submitting the single qsub job myself.
So, the first works, the second doesn't.
✘ [Jan/15 10:20] vsc31426@tier2-p-login-4 /vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/atlas23beta/atlas $ qsub -N initialize_qc -A lp_h_microbe -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.1yu6gl4z/snakejob.initialize_qc.186.sh
50160613.tier2-p-moab-2.tier2.hpc.kuleuven.be
✔ [Jan/15 10:21] vsc31426@tier2-p-login-4 /vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/atlas23beta/atlas $ qstat
Job ID                         Name           User      Time Use S Queue
50160613.tier2-p-moab-2.tier2  initialize_qc  vsc31426  0        Q q1h
✔ [Jan/15 10:21] vsc31426@tier2-p-login-4 /vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/atlas23beta/atlas $ qstat
Job ID                         Name           User      Time Use S Queue
50160613.tier2-p-moab-2.tier2  initialize_qc  vsc31426  0        R q1h
✔ [Jan/15 10:22] vsc31426@tier2-p-login-4 /vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/atlas23beta/atlas $ qsub -N initialize_qc -A lp_h_microbe -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 -w e /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.9yoj5h75/snakejob.initialize_qc.186.sh
qsub: Requested working directory 'e' is not a valid directory
Please specify a valid working directory.
full error log: atlas23beta_setupGM.pbs.zip
But I specified to submit only 5 jobs at the same time; in the log, I still see it submitting more?
restart-times: 0
cluster-config: "/ddn1/vol1/site_scratch/leuven/314/vsc31426/newatlas23beta/cluster/cluster_config.yaml" #abs path
cluster: "scheduler.py" #
cluster-status: "pbs_status.py" #
max-jobs-per-second: 1
max-status-checks-per-second: 1
cores: 5 # how many jobs you want to submit to your cluster queue
local-cores: 1
rerun-incomplete: true # recomended for cluster submissions
keep-going: false
The lines in the scheduler.py where the error occurs:
for key in key_mapping:
    if key in cluster_param:
        command += " "
        command += key_mapping[key].format(cluster_param[key])

command += ' {}'.format(jobscript)

eprint("submit command: " + command)

p = Popen(command.split(' '), stdout=PIPE, stderr=PIPE)
output, error = p.communicate()
if p.returncode != 0:
    raise Exception("Job can't be submitted\n" + output.decode("utf-8") + error.decode("utf-8"))
else:
    res = output.decode("utf-8")
My pbs script: atlas23beta_setupGM.zip
OK, maybe the problem is how the command is passed to Popen.
Can you replace the line p = Popen(command.split(' '), stdout=PIPE, stderr=PIPE)
with p = Popen(command, stdout=PIPE, stderr=PIPE)?
50160613.tier2-p-moab-2.tier2.hpc.kuleuven.be is the jobid?
After submitting the command, can you run the following with that jobid:
qstat -f -x 50160613.tier2-p-moab-2.tier2.hpc.kuleuven.be
and
qstat -f -x 50160613
Apparently the 5 cores you defined get overwritten by 36. I will try to fix that in 3143127, but for now you can run atlas with --jobs 5 so that only 5 jobs get submitted at the same time.
I suggest you set 8 threads in your atlas config.yaml and adapt your PBS script as follows:
#!/bin/bash -l
#PBS -A lp_h_microbe
#PBS -l nodes=1:ppn=5 # less than 5 threads are needed.
#PBS -l walltime=24:00:00 # or longer
#PBS -l pmem=20gb
#PBS -l partition=std # no need for bigmem; is std the standard partition?
#PBS -m ae #what does this stand for?
#PBS -M sofie.thijs@uhasselt.be
module purge
#module load Java/1.8.0_171 # I don't think it's used by Atlas
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
source activate atlas23beta
cd /vsc-hard-mounts/leuven-data/314/vsc31426/miniconda3/envs/atlas23beta/atlas
# to run:
atlas run all -w /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria \
--profile /ddn1/vol1/site_scratch/leuven/314/vsc31426/newatlas23beta/cluster \
--jobs 5
You can also run atlas in a screen session.
if you have a running job, e.g. if you submit a job outside of atlas.
Can you try: qstat -f -x 10488.pbs01
and
qstat -f -x 10488
Following your suggestion, I submitted a job generated by atlas, but outside of it:
qsub -N merge_pairs -q std -l nodes=1:ppn=10 -l mem=60gb -l walltime=30000 /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/.snakemake/tmp.6aq5vwnd/snakejob.merge_pairs.70.sh
Both qstat -f -x with "jobid".pbs01 and with just "jobid" gave me the same output:
qstat -f -x 10528.pbs01
Job Id: 10528.pbs01 Job_Name = merge_pairs Job_Owner = florentin@ln-0001.scelse.sg resources_used.cpupercent = 672 resources_used.cput = 00:01:22 resources_used.mem = 2908512kb resources_used.ncpus = 10 resources_used.vmem = 62745940kb resources_used.walltime = 00:00:11 job_state = F queue = std server = pbs01 Checkpoint = u ctime = Wed Jan 15 21:13:16 2020 Error_Path = ln-0001.scelse.sg:/gpfs1/scratch/florentin/EZ/Experiment2/test _gene_catalog/merge_pairs.e10528 exec_host = ca-0011/0*10 exec_vnode = (ca-0011:ncpus=10:mem=62914560kb) Hold_Types = n Join_Path = n Keep_Files = n Mail_Points = a mtime = Wed Jan 15 21:13:27 2020 Output_Path = ln-0001.scelse.sg:/gpfs1/scratch/florentin/EZ/Experiment2/tes t_gene_catalog/merge_pairs.o10528 Priority = 0 qtime = Wed Jan 15 21:13:16 2020 Rerunable = True Resource_List.mem = 62914560kb Resource_List.mpiprocs = 10 Resource_List.ncpus = 10 Resource_List.nodect = 1 Resource_List.nodes = 1:ppn=10 Resource_List.place = scatter Resource_List.select = 1:ncpus=10:mem=62914560KB:mpiprocs=10 Resource_List.walltime = 08:20:00 stime = Wed Jan 15 21:13:16 2020 session_id = 125506 jobdir = /home/florentin substate = 92 Variable_List = PBS_O_SYSTEM=Linux,PBS_O_SHELL=/bin/bash, PBS_O_HOME=/home/florentin,PBS_O_LOGNAME=florentin, PBS_O_WORKDIR=/gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalo g,PBS_O_LANG=en_US.utf-8, PBS_O_PATH=/home/florentin/miniconda3/envs/metagenome-atlas/bin:/home/ florentin/miniconda3/condabin:/opt/gcc/6.1.0/bin:/cm/local/apps/environ ment-modules/4.0.0/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbi n:/sbin:/usr/sbin:/cm/local/apps/environment-modules/4.0.0/bin:/usr/lpp /mmfs/bin:/opt/ibutils/bin:/opt/pbs/bin:/home/florentin/.local/bin:/hom e/florentin/bin:/usr/lpp/mmfs/bin:/opt/pbs/bin:/home/florentin/.local/b in:/home/florentin/bin,PBS_O_MAIL=/var/spool/mail/florentin, PBS_O_QUEUE=std,PBS_O_HOST=ln-0001.scelse.sg comment = Job run at Wed Jan 15 at 21:13 on (ca-0011:ncpus=10:mem=62914560k b) and finished etime = Wed Jan 15 21:13:16 2020 run_count = 1 Stageout_status = 1 Exit_status = 0 Submit_arguments = -N merge_pairs -q std -l nodes=1:ppn=10 -l mem=60gb -l w alltime=30000 /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog /EZ_atlas/.snakemake/tmp.6aq5vwnd/snakejob.merge_pairs.70.sh history_timestamp = 1579094007 project = _pbs_project_default
qstat -f -x 10528
Job Id: 10528.pbs01 Job_Name = merge_pairs Job_Owner = florentin@ln-0001.scelse.sg resources_used.cpupercent = 672 resources_used.cput = 00:01:22 resources_used.mem = 2908512kb resources_used.ncpus = 10 resources_used.vmem = 62745940kb resources_used.walltime = 00:00:11 job_state = F queue = std server = pbs01 Checkpoint = u ctime = Wed Jan 15 21:13:16 2020 Error_Path = ln-0001.scelse.sg:/gpfs1/scratch/florentin/EZ/Experiment2/test _gene_catalog/merge_pairs.e10528 exec_host = ca-0011/0*10 exec_vnode = (ca-0011:ncpus=10:mem=62914560kb) Hold_Types = n Join_Path = n Keep_Files = n Mail_Points = a mtime = Wed Jan 15 21:13:27 2020 Output_Path = ln-0001.scelse.sg:/gpfs1/scratch/florentin/EZ/Experiment2/tes t_gene_catalog/merge_pairs.o10528 Priority = 0 qtime = Wed Jan 15 21:13:16 2020 Rerunable = True Resource_List.mem = 62914560kb Resource_List.mpiprocs = 10 Resource_List.ncpus = 10 Resource_List.nodect = 1 Resource_List.nodes = 1:ppn=10 Resource_List.place = scatter Resource_List.select = 1:ncpus=10:mem=62914560KB:mpiprocs=10 Resource_List.walltime = 08:20:00 stime = Wed Jan 15 21:13:16 2020 session_id = 125506 jobdir = /home/florentin substate = 92 Variable_List = PBS_O_SYSTEM=Linux,PBS_O_SHELL=/bin/bash, PBS_O_HOME=/home/florentin,PBS_O_LOGNAME=florentin, PBS_O_WORKDIR=/gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalo g,PBS_O_LANG=en_US.utf-8, PBS_O_PATH=/home/florentin/miniconda3/envs/metagenome-atlas/bin:/home/ florentin/miniconda3/condabin:/opt/gcc/6.1.0/bin:/cm/local/apps/environ ment-modules/4.0.0/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbi n:/sbin:/usr/sbin:/cm/local/apps/environment-modules/4.0.0/bin:/usr/lpp /mmfs/bin:/opt/ibutils/bin:/opt/pbs/bin:/home/florentin/.local/bin:/hom e/florentin/bin:/usr/lpp/mmfs/bin:/opt/pbs/bin:/home/florentin/.local/b in:/home/florentin/bin,PBS_O_MAIL=/var/spool/mail/florentin, PBS_O_QUEUE=std,PBS_O_HOST=ln-0001.scelse.sg comment = Job run at Wed Jan 15 at 21:13 on (ca-0011:ncpus=10:mem=62914560k b) and finished etime = Wed Jan 15 21:13:16 2020 run_count = 1 Stageout_status = 1 Exit_status = 0 Submit_arguments = -N merge_pairs -q std -l nodes=1:ppn=10 -l mem=60gb -l w alltime=30000 /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog /EZ_atlas/.snakemake/tmp.6aq5vwnd/snakejob.merge_pairs.70.sh history_timestamp = 1579094007 project = _pbs_project_default
Hi Silas,
No, it looks like I made it worse; now it cannot make the temp files. Full error log: atlas23beta_setupGM.pbs.zip
submit command: qsub -N initialize_qc -A lp_h_microbe -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.11aobbon/snakejob.initialize_qc.190.sh
Traceback (most recent call last):
File "/ddn1/vol1/site_scratch/leuven/314/vsc31426/newatlas23beta/cluster/scheduler.py", line 63, in
qstat -f -x 50160970
<?xml version="1.0"?>
✔ [Jan/16 01:38] vsc31426@tier2-p-login-4 /vsc-hard-mounts/leuven-data/314/vsc31426/scripts/3.atlas_pipeline $ qstat -f -x 50160970.tier2-p-moab-2.tier2.hpc.kuleuven.be
<?xml version="1.0"?>
@fconstancias Thank you for your information.
I updated the clusterprofile template accordingly. The partition is now defined with -q, and for the job ID I take the number in front of the dot, as sketched below.
Can you submit jobs now? What if you re-download the clusterprofile?
@Sofie8
Ok, if you change the line to:
p = Popen(command, stdout=PIPE, stderr=PIPE, check=True, shell=True)
This is the same as in the cluster-profile for pbs.
In theory, for testing you can run:
$HOME/.config/snakemake/cluster/scheduler.py /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.11aobbon/snakejob.initialize_qc.190.sh
@fconstancias Thank you for your information. I updated the clusterprofile template accordingly. The partition is defined with -q and I take the number in front of the dot. Can you submit jobs now? What if you re-download the clusterprofile?
Thanks for the update.
I updated the scheduler.py and key_mapping.yaml accordingly.
head scheduler.py key_mapping.yaml
==> scheduler.py <==
#!/usr/bin/env python3
import sys, os
from subprocess import Popen, PIPE
import yaml
import re
def eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)
==> key_mapping.yaml <==
# only parameters defined in key_mapping (see below) are passed to the command in the order specified.
system: "pbs" #check if system is defined below
slurm:
  command: "sbatch --parsable"
  key_mapping:
    name: "--job-name={}"
    threads: "-n {}"
    mem: "--mem={}g"
    account: "--account={}"
Then I ran atlas init (from my metagenome-atlas conda environment):
atlas init --db-dir /gpfs1/scratch/florentin/db/atlas --working-dir EZ_atlas --data-type metagenome --assembler megahit --threads=10 --skip-qc 01_QC
The dry run was fine:
atlas run -w EZ_atlas -c /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/config.yaml --profile ~/.config/snakemake/cluster/ --jobs 4 genecatalog -n
so I ran atlas run:
atlas run -w EZ_atlas -c /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/config.yaml --profile ~/.config/snakemake/cluster/ --jobs 4 genecatalog
As you can see in the attached log file, atlas is now able to submit jobs! log.txt
submit command: qsub -N init_pre_assembly_processing -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 /gpfs1/scratch/florentin/EZ/Experiment2/test_gene_catalog/EZ_atlas/.snakemake/tmp.a7o3xg6m/snakejob.init_pre_assembly_processing.84.sh
Submitted job 84 with external jobid '10614'.
but there is an issue:
snakemake.exceptions.WorkflowError: Failed to obtain job status. See above for error message.
qstat confirmed that jobs are submitted properly
qstat
Job id Name User Time Use S Queue
10600.pbs01 init_pre_assemb florentin 00:00:12 R dev
10601.pbs01 init_pre_assemb florentin 00:00:04 R dev
10602.pbs01 init_pre_assemb florentin 0 Q dev
10603.pbs01 init_pre_assemb florentin 0 Q dev
10604.pbs01 init_pre_assemb florentin 0 Q dev
10605.pbs01 init_pre_assemb florentin 0 Q dev
10606.pbs01 init_pre_assemb florentin 0 Q dev
10607.pbs01 init_pre_assemb florentin 0 Q dev
10608.pbs01 init_pre_assemb florentin 0 Q dev
10609.pbs01 init_pre_assemb florentin 0 Q dev
10610.pbs01 init_pre_assemb florentin 0 Q dev
10611.pbs01 init_pre_assemb florentin 0 Q dev
10612.pbs01 init_pre_assemb florentin 0 Q dev
10613.pbs01 init_pre_assemb florentin 0 Q dev
10614.pbs01 init_pre_assemb florentin 0 Q dev
Updating the entire ~/.config/snakemake/cluster/ with
cookiecutter --output-dir ~/.config/snakemake https://github.com/metagenome-atlas/clusterprofile.git
gave the same issue.
@fconstancias Great, submitting works.
Now for the cluster status.
To begin with, this is not optional. You could comment the line cluster_status in the cluster/config.yaml. Status is important if a job gets killed or finishes early.
On one of your running jobs could you try:
$HOME/.config/snakemake/cluster/cluster_status.py 10600.pbs01
and
$HOME/.config/snakemake/cluster/cluster_status.py 10600
You should get running back.
And if you take the ID of a finished job, you should get either success or failed.
@SilasK yes that's cool, I am closer than ever.
To begin with, this is not optional. You could comment the line cluster_status in the cluster/config.yaml
Are you sure? Running on a toy dataset, I had the feeling that it was stuck because of that error. Or do you mean that if I comment it out, it just doesn't check the status, so it should work?
There is no cluster_status.py, only the following .yaml and .py files:
ls ~/.config/snakemake/cluster/
cluster_config.yaml config.yaml key_mapping.yaml lsf_status.sh pbs_status.py scheduler.py slurm_status.py
pbs_status.py was not executable by default.
~/.config/snakemake/cluster/pbs_status.py 10637.pbs01
-bash: /home/florentin/.config/snakemake/cluster/pbs_status.py: Permission denied
chmod +x ~/.config/snakemake/cluster/pbs_status.py
~/.config/snakemake/cluster/pbs_status.py 10637.pbs01
Traceback (most recent call last):
  File "/home/florentin/.config/snakemake/cluster/pbs_status.py", line 12, in <module>
    xmldoc = ET.ElementTree(ET.fromstring(res.stdout.decode())).getroot()
  File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/xml/etree/ElementTree.py", line 1314, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: syntax error: line 1, column 0
~/.config/snakemake/cluster/pbs_status.py 10637
Traceback (most recent call last):
  File "/home/florentin/.config/snakemake/cluster/pbs_status.py", line 12, in <module>
    xmldoc = ET.ElementTree(ET.fromstring(res.stdout.decode())).getroot()
  File "/home/florentin/miniconda3/envs/metagenome-atlas/lib/python3.6/xml/etree/ElementTree.py", line 1314, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: syntax error: line 1, column 0
@fconstancias
For today, you could just comment the line
cluster-status: "pbs_status.py" #
in the cluster/config.yaml
I will try to find a fix.
@SilasK When changing the line to p = Popen(command, stdout=PIPE, stderr=PIPE, check=True, shell=True), it says that check is not a valid argument (it is an argument of subprocess.run, not of Popen). So I changed it to p = Popen(command, stdout=PIPE, stderr=PIPE, shell=True).
Then it gave an error in line 75 of scheduler.py. I tried to understand what it does: it's reading my job ID, and that is just the number before the first dot, so I changed jobid= int(res.strip().split()[-1]) into jobid= int(res.strip().split('.')[-6]).
Running: atlas run all -w /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria --profile /ddn1/vol1/site_scratch/leuven/314/vsc31426/newatlas23beta/cluster --jobs 5
Finally, it successfully submits jobs:
rule initialize_qc:
input: /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/raw/X5A3_R1.fastq.gz, /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/raw/X5A3_R2.fastq.gz
output: X5A3/sequence_quality_control/X5A3_raw_R1.fastq.gz, X5A3/sequence_quality_control/X5A3_raw_R2.fastq.gz
log: X5A3/logs/QC/init.log
jobid: 182
wildcards: sample=X5A3
priority: 80
threads: 4
resources: mem=10, java_mem=8, time=0.5
submit command: qsub -N initialize_qc -A lp_h_microbe -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.dh22p5o1/snakejob.initialize_qc.182.sh
Submitted job 182 with external jobid '50161380'.
It then gave the same error you describe above: pbs_status.py: Permission denied. So I made it executable with chmod +x ~/.config/snakemake/cluster/pbs_status.py, tested ~/.config/snakemake/cluster/pbs_status.py 50161380, and it said success.
Now, however, it gives an error in the next job:
Error in rule get_read_stats:
    jobid: 0
    output: X5A1/sequence_quality_control/read_stats/raw.zip, X5A1/sequence_quality_control/read_stats/raw_read_counts.tsv
    log: X5A1/logs/QC/read_stats/raw.log (check log file(s) for error message)
The log says: /usr/bin/bash: line 2: reformat.sh: command not found
reformat.sh is correctly installed; it just needs my atlas env to be activated (source activate atlas23beta). In the previous rule, initialize_qc, it successfully activates the environment (Activating conda environment: /ddn1/vol1/site_scratch/leuven/314/vsc31426/db/atlas23beta/conda_envs/b70c4153), but not for get_read_stats, so is it executing outside my atlas23beta env? Do I need to add the atlas23beta env bin to my bashrc profile?
Lastly, when I submit the job as a PBS script, I still get the error:
submit command: qsub -N get_read_stats -A lp_h_microbe -l nodes=1:ppn=4 -l mem=10gb -l walltime=3000 /ddn1/vol1/site_scratch/leuven/314/vsc31426/Valeria/.snakemake/tmp.cfc8ds4n/snakejob.get_read_stats.59.sh
Traceback (most recent call last):
File "/ddn1/vol1/site_scratch/leuven/314/vsc31426/newatlas23beta/cluster/scheduler.py", line 66, in
@SilasK
OK, I asked my systems admin, and the system doesn't allow snakemake to run qsub from within a PBS script. So that is the reason for the rejection error.
In a screen session on my login node it works, yes. So the question is whether snakemake only submits qsub jobs from my login node, or also does actual calculations at some point (small calculations are allowed on the login node, but not extensive ones).
So the only thing I need to fix is that each qsub job knows it has to load the appropriate conda environment?
Well, in the end I think that for my case, with not too many jobs in parallel but many samples in one job, the PBS-script job submission also still worked, I guess. I was just trying to see how to run it most efficiently.
@Sofie8 Great, you get the submit and status working.
For the problem with the missing reformat.sh:
In theory, if you start atlas in the atlas23beta env, it should find reformat.sh. But you get the error when submitting the jobscript from inside the atlas23beta environment?
Check if you have initialized conda correctly: conda activate base; conda init bash
@fconstancias It seems the output of qstat -f -x is not the same as Sofie's. That's why the status script doesn't work.
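For reference, here is a rough sketch of a status check that parses the plain-text qstat -f -x output you posted above instead of XML (this is not the shipped pbs_status.py; treat it as a starting point). It prints running, success, or failed, which is what snakemake expects from a cluster-status script:
#!/usr/bin/env python3
import subprocess
import sys

jobid = sys.argv[1]
res = subprocess.run(
    ["qstat", "-f", "-x", jobid],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True,
)

# Collect "key = value" pairs such as "job_state = F" and "Exit_status = 0"
fields = {}
for line in res.stdout.splitlines():
    if " = " in line:
        key, value = line.split(" = ", 1)
        fields[key.strip()] = value.strip()

state = fields.get("job_state", "")
if state == "F":  # finished: decide from the exit status
    print("success" if fields.get("Exit_status") == "0" else "failed")
elif state:       # Q, R, H, E, ... are all still in progress
    print("running")
else:
    print("failed")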
Continuing the discussion with @Sofie8 from #258:
This command runs atlas on one node for 72 h with 36 processors. However, each step uses the full 36 processors, which means the steps are executed one after the other.
Atlas, thanks to the underlying snakemake, can submit each step itself as a separate job on your PBS cluster! In other words, atlas runs the qsub command for you.
You can set up the cluster support following the updated documentation: https://metagenome-atlas.readthedocs.io/en/latest/usage/cluster.html
But keep in mind to set the threads to something like