BIMSBbioinfo / pigx_rnaseq

Bulk RNA-seq Data Processing, Quality Control, and Downstream Analysis Pipeline
GNU General Public License v3.0

Rule specific memory requirements are not respected in cluster submission #128

Closed borauyar closed 2 months ago

borauyar commented 2 years ago

Rule-specific memory requirements defined under settings -> execution -> rules -> [rule name] -> memory are not respected at cluster submission. Instead, the job is assigned to a node with the default memory requirement defined under settings -> execution -> rules -> __default__ -> memory. I ran into this issue because hisat2-build kept failing on the cluster but not during local execution. When I checked the resources the cluster allocated for hisat2-build (qstat -r), I could see that the allocated memory depends on the value defined for the __default__ memory. All jobs are allocated the same resources, so the rule-specific resource allocation does not seem to work.

Cluster configuration file is okay, it contains all the right information. @rekado do you have any idea why this would be? I looked into the cluster submission settings, but couldn't see anything obvious.

rekado commented 2 years ago

Bora Uyar writes:

> Cluster configuration file is okay, it contains all the right information. @rekado do you have any idea why this would be? I looked into the cluster submission settings, but couldn't see anything obvious.

Hmm.

The runner script (from pigx-common) only invokes qsub with values from the generated cluster configuration file. I suppose it would be good to see what qsub command line is actually produced and executed, so that we can bisect the problem space.

I guess qacct shows you that the default resources have been requested. (Is it only about memory?)

-- Ricardo

borauyar commented 2 years ago

> Is it only about memory?

Whatever is in __default__ or the other cluster settings (e.g. queue name, stack size, etc.) is used. In this case, the only thing that matters is that the rule-specific memory is not used. It would also be an issue if we wanted to use different queues for different rules.

> I suppose it would be good to see what qsub command line is actually produced and executed

Yes, that makes sense. I think we also need to figure out how snakemake utilises these cluster configurations. It seems that this functionality is deprecated (although it should still work) and replaced by profiles. See here. I don't know if we have to switch to that.

rekado commented 2 years ago

Bora Uyar writes:

> I think we also need to figure out how snakemake utilises these cluster configurations.

IIUC it merely makes the config fields available as wildcards. We’re referring to these cluster.* wildcards only within the qsub command that we’re generating.
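As a rough illustration of that mechanism (not the actual pigx-common runner code): the per-rule cluster config entry is exposed as `cluster`, and the submit command refers to its fields. The template string and the `render_submit_command` helper below are assumptions, loosely modelled on the qsub lines and the cluster config field names (MEM, h_stack, nthreads, queue) that appear later in this thread.

```python
# Hypothetical sketch: expand {cluster.<field>} placeholders in a qsub template.
# The template is an assumption; only the field names come from this thread.
from types import SimpleNamespace

QSUB_TEMPLATE = (
    "qsub -q {cluster.queue} -l h_stack={cluster.h_stack} "
    "-l h_vmem={cluster.MEM} -pe smp {cluster.nthreads}"
)


def render_submit_command(entry: dict) -> str:
    """Fill the template from a single cluster config entry."""
    # SimpleNamespace lets str.format resolve attribute access like cluster.MEM.
    return QSUB_TEMPLATE.format(cluster=SimpleNamespace(**entry))


entry = {"MEM": "2000M", "h_stack": "128M", "nthreads": 1, "queue": "all.q"}
print(render_submit_command(entry))
```

Whatever entry is selected for a job fully determines the memory request, so the interesting question is which entry gets selected.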

> is deprecated (although it should still work) and replaced by profiles. See here. I don't know if we have to switch to that.

Eventually, yes. Embarrassingly, we’re still on some 5.x version of Snakemake in Guix. Nobody felt it necessary to upgrade… :) So, I think with version 5 it should still work.

I wonder when it broke; or if maybe it never quite worked…?

-- Ricardo

rekado commented 2 years ago

I cannot reproduce this for two reasons:

To test this I changed pigx-rnaseq to run "echo qsub" instead of "qsub", so I can see the actual command it issues. Here are just two of the rules it executed:

[Sat Apr  2 23:29:38 2022]
rule salmon_index:
    input: /home/rekado/dev/pigx/pigx_rnaseq/tests/sample_data/sample.cdna.fasta, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/input_annotation_stats.tsv
    output: /home/rekado/dev/pigx/pigx_rnaseq/tests/output/salmon_index/pos.bin
    log: /home/rekado/dev/pigx/pigx_rnaseq/tests/output/logs/salmon/salmon_index.log
    jobid: 4
    resources: mem_mb=5000

Submitted job 4 with external jobid 'qsub -v R_LIBS_USER -v R_LIBS -v PIGX_PATH -v GUIX_LOCPATH -q all.q -l h_stack=128M -l h_vmem=5000M -b y -pe smp 8 -cwd -o ./ -e ./ /home/rekado/dev/pigx/pigx_rnaseq/tests/output/.snakemake/tmp.itv32dqj/snakejob.salmon_index.4.sh'.

[Sat Apr  2 23:29:38 2022]
rule hisat2_index:
    input: /home/rekado/dev/pigx/pigx_rnaseq/tests/sample_data/sample.fasta, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/input_annotation_stats.tsv
    output: /home/rekado/dev/pigx/pigx_rnaseq/tests/output/hisat2_index/GRCm38_index.1.ht2l, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/hisat2_index/GRCm38_index.2.ht2l, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/hisat2_index/GRCm38_index.3.ht2l, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/hisat2_index/GRCm38_index.4.ht2l, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/hisat2_index/GRCm38_index.5.ht2l, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/hisat2_index/GRCm38_index.6.ht2l, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/hisat2_index/GRCm38_index.7.ht2l, /home/rekado/dev/pigx/pigx_rnaseq/tests/output/hisat2_index/GRCm38_index.8.ht2l
    log: /home/rekado/dev/pigx/pigx_rnaseq/tests/output/logs/hisat2_index.log
    jobid: 17
    resources: mem_mb=32000

Submitted job 17 with external jobid 'qsub -v R_LIBS_USER -v R_LIBS -v PIGX_PATH -v GUIX_LOCPATH -q all.q -l h_stack=128M -l h_vmem=2000M -b y -pe smp 1 -cwd -o ./ -e ./ /home/rekado/dev/pigx/pigx_rnaseq/tests/output/.snakemake/tmp.itv32dqj/snakejob.hisat2_index.17.sh'.

You can see that the requested memory does indeed differ.

Can you tell me how to reproduce this? I'm guessing that you're not using the Guix environment, so you might be using a more recent Snakemake. I'm using 5.32.2.

rekado commented 2 years ago

I just noticed in the above output that the memory specified in the rule and the h_vmem in the qsub command differ. We probably shouldn't specify different values for local and cluster execution and just derive one from the other.
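One way to do that, sketched under assumptions: keep a single memory value in MB per rule and derive both the Snakemake `mem_mb` resource and the `h_vmem` string for qsub from it. The function name `memory_settings` is hypothetical and not part of pigx.

```python
# Hypothetical helper: derive local and cluster memory values from one number.
def memory_settings(memory_mb: int) -> dict:
    """Return the Snakemake resource and the qsub value for the same limit."""
    if memory_mb <= 0:
        raise ValueError("memory must be a positive number of megabytes")
    return {
        "mem_mb": memory_mb,        # used by Snakemake for local scheduling
        "h_vmem": f"{memory_mb}M",  # passed to qsub as -l h_vmem=...
    }


print(memory_settings(32000))
```

With a single source value, the rule resources and the submitted request cannot drift apart the way 32000 and 2000M do in the hisat2_index output above.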

borauyar commented 2 years ago

No, I am using the Guix environment, but it might be outdated. My snakemake version is also 5.32.2, but not having threads in all the rules didn't generate an error for me. I will try updating Guix and see how it works.

rekado commented 2 years ago

Hmm, too bad. I was hoping this could be explained by a difference in the snakemake version.

My Guix is very recent, commit 1d62b15dc1f84b4b57cbc423b880a4b4096fda70 from three or four days ago.

Does etc/settings.yaml correspond to etc/settings.yaml.in?

FWIW, I used tests/settings.yaml, and edited it only to add execution:submit-to-cluster:true, so everything else is taken from etc/settings.yaml.

alexg9010 commented 2 months ago

I am facing the same issue. My trimming jobs are failing on the cluster even though the per-rule specification is adjusted accordingly. It only works if I increase the default memory limit.

rekado commented 2 months ago

@alexg9010 can you show us the relevant log section that prints the full qsub command in question? Does a memory limit get passed to qsub at all?

alexg9010 commented 2 months ago

@rekado

This is the execution section of the settings file:

[...]
execution:
  submit-to-cluster: yes
  jobs: 16
  nice: 19
  mem_mb: 64000
  cluster:
    missing-file-timeout: 360
    memory: 64000
    stack: 128M
    queue: all.q
    contact-email: none
    log-dir: 'job_logs'
    args: ''
  rules:
    __default__:
      threads: 1
      memory: 2000
    translate_sample_sheet_for_report:
      threads: 1
      memory: 500
    trim_qc_reads:
      threads: 1
      memory: 4000
[...]

Snakemake log:

```
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 2
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1       trim_qc_reads_pe
    1

[Sun Mar 31 18:29:58 2024]
rule trim_qc_reads_pe:
    input: /fast/AG_Klussmann/swaroop/mergedReplicates_RNAseq_PDE3Awt_ko_HTNB_rats_small_vessels_2024_01/P3412_RNA_09_NZ544_RNA_01_S48_R1_001.fastq.gz, /fast/AG_Klussmann/swaroop/mergedReplicates_RNAseq_PDE3Awt_ko_HTNB_rats_small_vessels_2024_01/P3412_RNA_09_NZ544_RNA_01_S48_R2_001.fastq.gz
    output: /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R1.fq.gz, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R2.fq.gz, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/QC/NZ_1.pe.fastp.html, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/QC/NZ_1.pe.fastp.json
    log: /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/logs/trim_reads.NZ_1.log
    jobid: 0
    wildcards: sample=NZ_1
    resources: mem_mb=4000

/bin/bash: line 1: 21688 Killed /gnu/store/ca7xy52qy7vpva1gndki6rzj7zlznqzb-fastp-0.23.2/bin/fastp --adapter_sequence=AGATCGGAAGAGCACACGTCTGAACTCCAGTCA --adapter_sequence_r2=AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT --in1 /fast/AG_Klussmann/swaroop/mergedReplicates_RNAseq_PDE3Awt_ko_HTNB_rats_small_vessels_2024_01/P3412_RNA_09_NZ544_RNA_01_S48_R1_001.fastq.gz --in2 /fast/AG_Klussmann/swaroop/mergedReplicates_RNAseq_PDE3Awt_ko_HTNB_rats_small_vessels_2024_01/P3412_RNA_09_NZ544_RNA_01_S48_R2_001.fastq.gz --out1 /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R1.fq.gz --out2 /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R2.fq.gz -h /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/QC/NZ_1.pe.fastp.html -j /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/QC/NZ_1.pe.fastp.json >> /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/logs/trim_reads.NZ_1.log 2>&1
[Sun Mar 31 18:33:37 2024]
Error in rule trim_qc_reads_pe:
    jobid: 0
    output: /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R1.fq.gz, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R2.fq.gz, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/QC/NZ_1.pe.fastp.html, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/QC/NZ_1.pe.fastp.json
    log: /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/logs/trim_reads.NZ_1.log (check log file(s) for error message)
    shell:
        /gnu/store/ca7xy52qy7vpva1gndki6rzj7zlznqzb-fastp-0.23.2/bin/fastp --adapter_sequence=AGATCGGAAGAGCACACGTCTGAACTCCAGTCA --adapter_sequence_r2=AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT --in1 /fast/AG_Klussmann/swaroop/mergedReplicates_RNAseq_PDE3Awt_ko_HTNB_rats_small_vessels_2024_01/P3412_RNA_09_NZ544_RNA_01_S48_R1_001.fastq.gz --in2 /fast/AG_Klussmann/swaroop/mergedReplicates_RNAseq_PDE3Awt_ko_HTNB_rats_small_vessels_2024_01/P3412_RNA_09_NZ544_RNA_01_S48_R2_001.fastq.gz --out1 /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R1.fq.gz --out2 /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R2.fq.gz -h /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/QC/NZ_1.pe.fastp.html -j /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/QC/NZ_1.pe.fastp.json >> /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/logs/trim_reads.NZ_1.log 2>&1
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job trim_qc_reads_pe since they might be corrupted:
/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R1.fq.gz, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/trimmed_reads/NZ_1.trimmed.R2.fq.gz
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
```

Cluster Jobs status:

This was the qsub command:

submit_cmd               qsub -v R_LIBS_USER -v R_LIBS -v PIGX_PATH -v GUIX_LOCPATH -q all.q -l h_stack=128M -l h_vmem=2000M -b y -pe smp 1 -cwd -o job_logs/ -e job_logs/ 

The reason the job failed was apparently:

failed                   52  : cgroups enforced memory limit

```
$ qacct -j 7038352
==============================================================
qname                    all.q
hostname                 max201
group                    agosdsc_usr
owner                    agosdsc
project                  NONE
department               akalin
jobname                  snakejob.trim_qc_reads_pe.13.sh
jobnumber                7038352
taskid                   undefined
pe_taskid                NONE
account                  akalin
priority                 0
cwd                      /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb
submit_host              max040
submit_cmd               qsub -v R_LIBS_USER -v R_LIBS -v PIGX_PATH -v GUIX_LOCPATH -q all.q -l h_stack=128M -l h_vmem=2000M -b y -pe smp 1 -cwd -o job_logs/ -e job_logs/ /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/.snakemake/tmp.68ti9tal/snakejob.trim_qc_reads_pe.13.sh
qsub_time                03/31/2024 18:29:55.387
start_time               03/31/2024 18:29:55.545
end_time                 03/31/2024 18:33:38.123
exec_host_list           max201:1
granted_pe               smp
slots                    1
hard_resources           h_rt=345600,h_stack=128M,m_mem_free=2000M,normal=true,os=centos7
soft_resources           NONE
hard_queues              NONE
soft_queues              NONE
granted_req.             1,0:h_rt=4:00:00:00,h_stack=128.000M,m_mem_free=1.953G,normal=1,os=centos7
failed                   52 : cgroups enforced memory limit
deleted_by               NONE
exit_status              1
ru_wallclock             222.578
ru_utime                 427.748
ru_stime                 11.019
ru_maxrss                2012784
ru_ixrss                 0
ru_ismrss                0
ru_idrss                 0
ru_isrss                 0
ru_minflt                1632048
ru_majflt                111
ru_nswap                 0
ru_inblock               105344
ru_oublock               40
ru_msgsnd                0
ru_msgrcv                0
ru_nsignals              0
ru_nvcsw                 3323266
ru_nivcsw                2199779
wallclock                222.650
cpu                      438.767
mem                      1154.795
io                       2.767
iow                      0.180
ioops                    17648
maxvmem                  3.416G
maxrss                   1.951G
maxpss                   1.949G
arid                     undefined
arname                   NONE
jc_name                  NONE
bound_cores              0,4
resource_map             NONE
devices                  NONE
gpus                     NONE
gpu_usage                NONE
failcnt                  0
memsw.failcnt            0
max_usage_in_bytes       1.953G
memsw.max_usage_in_bytes 1.953G
max_cgroups_memory       1.941G
hold_jid                 NONE
orig_exec_time           -/-
exec_time                -/-
```

alexg9010 commented 2 months ago

Another example using hisat2_index:

$ pigx-rnaseq -s settings_editted.yaml sample_sheet_merged_replicates_sub.csv --target=hisat2_map --verbose

Resources before job selection: {'_cores': 9223372036854775807, '_nodes': 16}
Ready jobs (1):
        hisat2_index
Selected jobs (1):
        hisat2_index
Resources after job selection: {'_cores': 9223372036854775806, '_nodes': 15}

[Wed Apr  3 16:42:02 2024]
rule hisat2_index:
    input: /fast/AG_Klussmann/swaroop/rat_annotation/genome/Rattus_norvegicus.mRatBN7.2.dna.toplevel.fa, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/input_annotation_stats.tsv
    output: /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.1.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.2.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.3.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.4.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.5.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.6.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.7.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.8.ht2l
    log: /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/logs/hisat2_index.log
    jobid: 2
    resources: mem_mb=32000

Jobscript:
#!/gnu/store/v9p25q9l5nnaixkhpap5rnymmwbhf9rp-bash-minimal-5.1.16/bin/bash
# properties = {"type": "single", "rule": "hisat2_index", "local": false, "input": ["/fast/AG_Klussmann/swaroop/rat_annotation/genome/Rattus_norvegicus.mRatBN7.2.dna.toplevel.fa", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/input_annotation_stats.tsv"], "output": ["/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.1.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.2.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.3.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.4.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.5.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.6.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.7.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.8.ht2l"], "wildcards": {}, "params": {"index_directory": "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index"}, "log": ["/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/logs/hisat2_index.log"], "threads": 1, "resources": {"mem_mb": 32000}, "jobid": 2, "cluster": {"MEM": "2000M", "h_stack": "128M", "nthreads": 1, "queue": "all.q"}}

[...]

 cd /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb && \
PATH='/gnu/store/rw6n86c008xqdbjs3nk4i7ggf6srdpgs-python-wrapper-3.10.7/bin':$PATH /gnu/store/rw6n86c008xqdbjs3nk4i7ggf6srdpgs-python-wrapper-3.10.7/bin/python \
-m snakemake /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.1.ht2l --snakefile /gnu/store/1nwmyp16abzi3yhvk43g0m21plcbgw5g-pigx-rnaseq-0.1.0/libexec/pigx_rnaseq/snakefile.py \
--force -j --keep-target-files --keep-remote --max-inventory-time 0 \
--wait-for-files /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/.snakemake/tmp.hiqwoyvb /fast/AG_Klussmann/swaroop/rat_annotation/genome/Rattus_norvegicus.mRatBN7.2.dna.toplevel.fa /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/input_annotation_stats.tsv --latency-wait 360 \
 --attempt 1 --force-use-threads --scheduler greedy \
--wrapper-prefix https://github.com/snakemake/snakemake-wrappers/raw/ \
--directory /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb --configfiles /fast/AG_Akalin/agosdsc/projects/testing_swaroop/config.json  --allowed-rules hisat2_index --nocolor --notemp --no-hooks --nolock \
--mode 2  && touch /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/.snakemake/tmp.hiqwoyvb/2.jobfinished || (touch /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/.snakemake/tmp.hiqwoyvb/2.jobfailed; exit 1)

Submitted job 2 with external jobid 'Your job 7041965 ("snakejob.hisat2_index.2.sh") has been submitted'.

Cluster log:

$ qstat -j 7041965
==============================================================
job_number:                 7041965
jclass:                     NONE
exec_file:                  job_scripts/7041965
submission_time:            04/03/2024 16:42:02.576
owner:                      agosdsc
uid:                        23377
group:                      agosdsc_usr
gid:                        23377
supplementary group:        AG_Akalin, AG_Tursun, max-users, AG_Akalin_guest, view-login-users, view-users, slbt_coop, galaxyuser, employees-guests-coops, GS_Linux_VM_Access, cfdx, agosdsc_usr
sge_o_home:                 /home/agosdsc
sge_o_log_name:             agosdsc
sge_o_path:                 /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/pigx_work/bin:/fast/AG_Akalin/agosdsc/projects/testing_swaroop/.guix-profile/bin:/gnu/store/a3h6570ajx7mwksq3ymqd7m7nil3qzfv-profile/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/fast/S_Cluster/bin:/usr/local/bin:/fast/service/S_gridengine/8.8.1/bin/lx-amd64:/opt/puppetlabs/bin:/opt/dell/srvadmin/bin:/home/agosdsc/tools/homer/bin/:/home/agosdsc/bin
sge_o_shell:                /bin/bash
sge_o_workdir:              /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb
sge_o_host:                 max-login3
account:                    akalin
cwd:                        /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb
stderr_path_list:           NONE:NONE:job_logs/
hard_resource_list:         h_rt=345600,h_stack=128M,m_mem_free=2000M,normal=true,os=centos7
mail_list:                  agosdsc@max-login3
notify:                     FALSE
job_name:                   snakejob.hisat2_index.2.sh
stdout_path_list:           NONE:NONE:job_logs/
priority:                   0
jobshare:                   0
env_list:                   R_LIBS_USER=/dev/null,R_LIBS=/dev/null,PIGX_PATH=/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/pigx_work/bin:/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/pigx_work/bin:/fast/AG_Akalin/agosdsc/projects/testing_swaroop/.guix-profile/bin:/gnu/store/a3h6570ajx7mwksq3ymqd7m7nil3qzfv-profile/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/fast/S_Cluster/bin:/usr/local/bin:/fast/service/S_gridengine/8.8.1/bin/lx-amd64:/opt/puppetlabs/bin:/opt/dell/srvadmin/bin:/home/agosdsc/tools/homer/bin/:/home/agosdsc/bin,GUIX_LOCPATH=/home/agosdsc/.guix-profile/lib/locale
script_file:                /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/.snakemake/tmp.hiqwoyvb/snakejob.hisat2_index.2.sh
parallel environment:       smp range: 1
department:                 akalin
binding:                    set linear_per_task:1
mbind:                      NONE
submit_cmd:                 qsub -v R_LIBS_USER -v R_LIBS -v PIGX_PATH -v GUIX_LOCPATH -q all.q -l h_stack=128M -l h_vmem=2000M -b y -pe smp 1 -cwd -o job_logs/ -e job_logs/ /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/.snakemake/tmp.hiqwoyvb/snakejob.hisat2_index.2.sh
category_id:                348
request_dispatch_info:      FALSE
start_time            1:    04/03/2024 16:42:03.242
job_state             1:    t
exec_host_list        1:    max089:1
granted_req.          1,0:  m_mem_free=1.953G
usage                 1:    wallclock=00:00:00, cpu=00:00:00, mem=0.00000 GBs, io=0.00000 GB, iow=0.000 s, ioops=0, vmem=N/A, maxvmem=N/A, rss=N/A, maxrss=N/A
binding               1:    max089=0,1
gpu_usage             1:    NONE
cgroups_usage         1:    NONE
scheduling info:            (Scheduler job information only available for jobs requesting dispatch information)

This is the cluster_conf.json:

{
    "__default__": {
        "MEM": "2000M",
        "h_stack": "128M",
        "nthreads": 1,
        "queue": "all.q"
    },
    [...]
    "hisat2": {
        "MEM": "8000M",
        "h_stack": "128M",
        "nthreads": 2,
        "queue": "all.q"
    },
    "hisat2-build": {
        "MEM": "32000M",
        "h_stack": "128M",
        "nthreads": 2,
        "queue": "all.q"
    },
    [...]
}

alexg9010 commented 2 months ago

@rekado

I figured out the issue here! The problem was that the rule names used in the settings file did not match the rule names in the snakemake file:

For example in the settings file we used "hisat2-build" to specify the resources.

https://github.com/BIMSBbioinfo/pigx_rnaseq/blob/bd38e02d499fa93ef44f95aaf3b5c4c8c636a19c/etc/settings.yaml.in#L94-L96

Then, in the snakefile, the resources for the rule (which is named hisat2_index) are looked up under this settings name.

https://github.com/BIMSBbioinfo/pigx_rnaseq/blob/bd38e02d499fa93ef44f95aaf3b5c4c8c636a19c/snakefile.py#L356-L363

However, the qsub command for each rule is filled in from cluster_config.json by matching the rule names. This is why only the default settings were applied.
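The effect can be reproduced with a minimal sketch. It assumes the lookup behaves like a dictionary keyed by rule name that starts from `__default__` and overlays a matching entry if one exists; `cluster_settings_for` is a hypothetical helper for illustration, not Snakemake's API.

```python
# Minimal sketch of a rule-name lookup that falls back to __default__.
cluster_conf = {
    "__default__": {"MEM": "2000M", "nthreads": 1},
    "hisat2-build": {"MEM": "32000M", "nthreads": 2},
}


def cluster_settings_for(rule_name: str, conf: dict) -> dict:
    """Start from __default__ and overlay the entry named after the rule."""
    settings = dict(conf.get("__default__", {}))
    settings.update(conf.get(rule_name, {}))
    return settings


# The snakefile rule is called hisat2_index, so no entry matches.
print(cluster_settings_for("hisat2_index", cluster_conf)["MEM"])
# An entry keyed by the name used in the settings file would have matched.
print(cluster_settings_for("hisat2-build", cluster_conf)["MEM"])
```

A lookup like this never fails on an unknown name, which is why the mismatch only showed up as jobs being killed for exceeding the default limit.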

By simply using the correct rule name for resource specification in the execution->rules I was able to get the requested resources.

[...]
    trim_qc_reads:
      threads: 3
      memory: 16000
    trim_qc_reads_pe:
      threads: 3
      memory: 16000
    trim_qc_reads_se:
      threads: 3
      memory: 16000
[...]
    hisat2-build:
      threads: 2
      memory: 32000
    hisat2_index:
      threads: 2
      memory: 32000
[...]
[Wed Apr  3 17:04:35 2024]
rule hisat2_index:
    input: /fast/AG_Klussmann/swaroop/rat_annotation/genome/Rattus_norvegicus.mRatBN7.2.dna.toplevel.fa, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/input_annotation_stats.tsv
    output: /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.1.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.2.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.3.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.4.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.5.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.6.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.7.ht2l, /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.8.ht2l
    log: /fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/logs/hisat2_index.log
    jobid: 2
    resources: mem_mb=32000

Jobscript:
#!/gnu/store/v9p25q9l5nnaixkhpap5rnymmwbhf9rp-bash-minimal-5.1.16/bin/bash
# properties = {"type": "single", "rule": "hisat2_index", "local": false, "input": ["/fast/AG_Klussmann/swaroop/rat_annotation/genome/Rattus_norvegicus.mRatBN7.2.dna.toplevel.fa", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/input_annotation_stats.tsv"], "output": ["/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.1.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.2.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.3.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.4.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.5.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.6.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.7.ht2l", "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index/mRatBN7.2_index.8.ht2l"], "wildcards": {}, "params": {"index_directory": "/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/hisat2_index"}, "log": ["/fast/AG_Akalin/agosdsc/projects/testing_swaroop/role_of_pde3a_in_htnb/logs/hisat2_index.log"], "threads": 1, "resources": {"mem_mb": 32000}, "jobid": 2, "cluster": {"MEM": "32000M", "h_stack": "128M", "nthreads": 2, "queue": "all.q"}}
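
A mismatch between settings entries and rule names, as diagnosed above, could be caught before any job is submitted. The sketch below is an assumption, not existing pigx code: it compares the keys under execution -> rules with a set of rule names (supplied by hand here, taken from the Snakemake logs in this thread) and reports entries that match no rule.

```python
# Hypothetical pre-flight check; not part of pigx_rnaseq.
def unmatched_rule_settings(settings_rules: dict, snakefile_rules: set) -> list:
    """Return settings entries that do not correspond to any snakefile rule."""
    return sorted(
        name
        for name in settings_rules
        if name != "__default__" and name not in snakefile_rules
    )


# Keys as they appeared under execution -> rules before the fix.
settings_rules = {
    "__default__": {"threads": 1, "memory": 2000},
    "trim_qc_reads": {"threads": 1, "memory": 4000},
    "hisat2-build": {"threads": 2, "memory": 32000},
}
# Rule names as printed in the Snakemake logs above.
snakefile_rules = {"trim_qc_reads_pe", "hisat2_index"}

for name in unmatched_rule_settings(settings_rules, snakefile_rules):
    print(f"warning: settings entry '{name}' matches no rule; __default__ will be used")
```

For the values above this flags both `hisat2-build` and `trim_qc_reads`, the two entries that were silently ignored at submission.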