**ipstone** opened this issue 3 years ago (status: Open)
Another quick, related newbie question: is the memory requirement per job or per thread? Say I ask for 8 GB as the memory limit and run 4 threads: will I get 32 GB in total, or will each of the 4 threads get only 2 GB? Thanks
hello, the memory limit is per job. So if you ask for 8 GB and you are running 4 threads, the job is fine as long as the process altogether consumes at most 8 GB; otherwise it will be killed.
The `mem_mb` passed being the profile's default is a weird issue. I am guessing your profile is configured with `LSF_UNIT_FOR_LIMITS` set to GB, and then we compute your job's memory as 0.032 GB but round it up to 1 GB?
Could you post the `config.yaml` of your profile to help us debug it?
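To illustrate the rounding behaviour described above, here is a minimal sketch (hypothetical function name and conversion logic; this is not the profile's actual code) of how a `mem_mb` value could be converted to GB with ceiling rounding when `LSF_UNIT_FOR_LIMITS` is GB, so that even a tiny 32 MB request becomes 1 GB:

```python
import math

def mem_mb_to_lsf_units(mem_mb: int, lsf_unit: str = "GB") -> int:
    """Convert a Snakemake mem_mb value to the unit LSF expects.

    Illustrative sketch only: divide by 1000 per step up from MB and
    round up, since LSF limits must be whole numbers of the configured
    unit. A 32 MB request (0.032 GB) therefore rounds up to 1 GB.
    """
    divisor = {"MB": 1, "GB": 1000, "TB": 1000 ** 2}[lsf_unit]
    return math.ceil(mem_mb / divisor)

print(mem_mb_to_lsf_units(32))     # 1  (0.032 GB rounded up)
print(mem_mb_to_lsf_units(16384))  # 17 (16.384 GB rounded up)
```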
Thanks @leoisl
Here is my `config.yaml`, please take a look:

```yaml
latency-wait: "5"
jobscript: "lsf_jobscript.sh"
use-conda: "True"
use-singularity: "False"
printshellcmds: "True"
restart-times: "0"
jobs: "500"
cluster: "lsf_submit.py"
cluster-status: "lsf_status.py"
max-jobs-per-second: "10"
max-status-checks-per-second: "10"
```
Here is the `CookieCutter.py` content:

```python
class CookieCutter:
    """
    Cookie Cutter wrapper
    """

    @staticmethod
    def get_default_threads() -> int:
        return int("8")

    @staticmethod
    def get_default_mem_mb() -> int:
        return int("16384")

    @staticmethod
    def get_log_dir() -> str:
        return "logs/cluster"

    @staticmethod
    def get_default_queue() -> str:
        return ""

    @staticmethod
    def get_lsf_unit_for_limits() -> str:
        return "GB"

    @staticmethod
    def get_unknwn_behaviour() -> str:
        return "wait"

    @staticmethod
    def get_zombi_behaviour() -> str:
        return "ignore"

    @staticmethod
    def get_latency_wait() -> float:
        return float("5")
```
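Given these defaults, a rough sketch of how the submission command's memory flags could be assembled (a hypothetical helper, not the profile's actual `lsf_submit.py` code) shows how a 16384 MB default with `LSF_UNIT_FOR_LIMITS=GB` would end up as the `rusage[mem=17]` request visible in `bjobs -l` below:

```python
import math

def build_bsub_mem_args(mem_mb: int, threads: int, lsf_unit: str = "GB") -> str:
    """Illustrative sketch: turn the profile defaults into bsub-style
    memory arguments. 16384 MB with GB units rounds up to 17, yielding
    -M 17 and select[mem>17] rusage[mem=17] span[hosts=1]."""
    mem = math.ceil(mem_mb / 1000) if lsf_unit == "GB" else mem_mb
    return (f"-M {mem} -n {threads} "
            f"-R 'select[mem>{mem}] rusage[mem={mem}] span[hosts=1]'")

print(build_bsub_mem_args(16384, 4))
# -M 17 -n 4 -R 'select[mem>17] rusage[mem=17] span[hosts=1]'
```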
Additionally, I noticed some discrepancies between the LSF job description obtained through `bjobs -l` and the information in the Snakemake log file. For example, `bjobs -l` gives:
```
Job <1155966>, Job Name <gen_sigs.genetics_type=genetics_exon>, User <ipstone>,
     Project <default>, Application <default>, Status <RUN>, Queue <cpu queue>,
     Job Priority <12>, Command </cluster/data/lab/projects/ipstone/genetics_project/.snakemake/tmp.cunieacm/snakejob.gen_sigs.6.sh>,
     Share group charged </ipstone>, Esub <memlimit>
Wed May 19 18:01:37: Submitted from host <lx01>, CWD </cluster/data/lab/projects/ipstone/genetics_project>,
     Output File <logs/cluster/gen_sigs/genetics_type=genetics_exon/jobid6_c5692471-3cd3-4544-8e62-97db5dd49fbd.out>,
     Error File <logs/cluster/gen_sigs/genetics_type=genetics_exon/jobid6_c5692471-3cd3-4544-8e62-97db5dd49fbd.err>,
     4 Task(s), Requested Resources <select[mem>17] rusage[mem=17] span[hosts=1]>;
Wed May 19 18:01:38: Started 4 Task(s) on Host(s) <4*lt06>, Allocated 4 Slot(s) on Host(s) <4*lt06>,
     Execution Home </home/ipstone>, Execution CWD </cluster/data/lab/projects/ipstone/genetics_project>;
Wed May 19 20:23:33: Resource usage collected.
     The CPU time used is 33106 seconds.
     MEM: 11 Gbytes; SWAP: 0 Gbytes; NTHREAD: 84
     PGID: 51801; PIDs: 51801 51802 51804 51805 51810 51811 51845 71958 71959
     71960 71961 71962 71963 71967 71969 71970 71974 71975 71979 71983 71987
     71988 71998 71999 72000 72001 72002 72006 72013 72014 72015 72016 72018
     72022 72023 72024 72028 72029 72030 72031 72035 72036 72037 72041 72042
     72043 72047 72048 72049 72053 72061 72068 72075 72085 72098 72109 72119
     72132 72139 72143 72151 72161 72209 72214 72218 72219 72226 72229 72234
     72235 72236 72237 72245 72252 72265 72273 72279 72283 72290

 RUNLIMIT
 1440.0 min

 MEMLIMIT
     17 G

 MEMORY USAGE:
 MAX MEM: 11 Gbytes; AVG MEM: 10 Gbytes
```
whereas the Snakemake log file shows (same job, but from an earlier run that exited because a wrong input file name was given):
```
Resource usage summary:

    CPU time :                 5.23 sec.
    Max Memory :               -
    Average Memory :           -
    Total Requested Memory :   68.00 GB
    Delta Memory :             -
    Max Swap :                 -
    Max Processes :            6
    Max Threads :              14
    Run time :                 23 sec.
    Turnaround time :          23 sec.
```
It seems the jobs submitted by Snakemake multiplied my memory request (16 GB × 4) in the Snakemake log file, but the `bjobs -l` command shows the memory limit is 16 GB per job (as stated in your answer). My guess is that there might be a misreporting in the logs.
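For what it's worth, the numbers in the two outputs are arithmetically consistent with LSF reporting "Total Requested Memory" as the per-task request times the number of tasks; a small worked check, assuming the 16384 MB profile default was rounded up to whole GB:

```python
import math

default_mem_mb = 16384  # profile default from CookieCutter.py
tasks = 4               # "4 Task(s)" in the bjobs -l output

per_task_gb = math.ceil(default_mem_mb / 1000)  # 16.384 GB -> 17 GB
total_gb = per_task_gb * tasks

print(per_task_gb)  # 17, matching MEMLIMIT "17 G" in bjobs -l
print(total_gb)     # 68, matching "Total Requested Memory : 68.00 GB"
```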
Thanks a lot for the quick reply!
Lastly, here is our LSF version:

```
IBM Spectrum LSF Standard 10.1.0.10, Jun 23 2020
```
I was having the same issue, so I just changed `lsf_submit.py` to not set the `-M` option on this line: https://github.com/Snakemake-Profiles/lsf/blob/34c3c4c462d3a2070643a00033815f30bfd105e0/%7B%7Bcookiecutter.profile_name%7D%7D/lsf_submit.py#L90

I'm not sure what the point of having it there is?
> hello, the memory limit is per job. So if you ask for 8 GB and you are running 4 threads, if the process altogether is consuming at most 8 GB, it is fine, otherwise it will be killed.
@leoisl Please be aware that this is not universally true. The LSF system I use definitely interprets the argument as per thread.
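Whether LSF treats memory limits and `rusage[]` reservations as per job or per slot/task is a site-level configuration choice, which would explain why different clusters behave differently here. As one illustration (parameter names as I understand them from IBM's LSF documentation; your cluster admin should confirm what your site actually sets), the relevant knobs look something like:

```
# lsb.params (illustrative; check your site's actual configuration)
# If set, rusage[] reservations are made per task rather than per job,
# which multiplies the memory reservation by the number of tasks:
RESOURCE_RESERVE_PER_TASK = Y

# lsf.conf (illustrative)
# If set to y, -M is enforced by LSF as a per-job limit instead of
# being applied as a per-process OS memory limit:
LSB_JOB_MEMLIMIT = y
```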
Hello,
I am not sure whether there's something wrong in my Snakefile setup: When I use
Is there some way to change the memory requirement for a run without changing the snakemake-profile? If there's a quick way to edit the profile to make the change, that would be a valuable solution for me as well (right now, using cookiecutter to create a new profile works for me, but editing the profile directly seems like it could be quicker/more direct?)

Thanks a lot
Isaac