Closed: ftabaro closed this issue 2 years ago.

> I would like to ask how to manage log files... can I still get this behavior?
@ftabaro I don't know how to do this with this profile, but in case it could be helpful, I put together an example of how to insert the date into the log file paths with the smk-simple-slurm profile.
Thank you very much @jdblischak! I wasn't really aware I could write multiline bash commands in those YAML config files. This opens a whole new world. I am very thankful for this!
I got the profile running; I just didn't update the issue here. I guess I could close this and maybe open a pull request. In short: I added a question to the Cookiecutter template holding the desired log path; then, at each submission, the submit script checks that the folder exists and creates it if it doesn't. I had to fiddle a bit with the argument conversion and understand how the resources were parsed into the script, but it was fun and I actually learned something.
After templating, Cookiecutter's `settings.json` looks like this:
```json
{
    "SBATCH_DEFAULTS": "",
    "CLUSTER_NAME": "",
    "CLUSTER_CONFIG": "",
    "ADVANCED_ARGUMENT_CONVERSION": "no",
    "LOG_FOLDER": "results/log/slurm"
}
```
Then I implemented a static method on the `CookieCutter` class:
```python
# needs: from datetime import date
# needs: import os
@staticmethod
def get_formatted_log_folder() -> str:
    """Get the log folder from the config and append the current date."""
    log_folder = CookieCutter.LOG_FOLDER
    if log_folder != "":
        today = date.today().isoformat()
        log_folder = os.path.join(log_folder, today)
    return log_folder
```
This allows me to pull the formatted path into the `slurm-submit.py` script:

```python
LOG_FOLDER = CookieCutter.get_formatted_log_folder()
```
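For example, with `LOG_FOLDER` set to `results/log/slurm`, a run submitted on 23 March 2022 writes its Slurm logs under `results/log/slurm/2022-03-23` (ISO date format).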
Then, in the `RESOURCE_MAPPING` dict I added:

```python
"output": ("output",),  # the trailing comma keeps this a one-element tuple instead of a plain string
"error": ("error",),
```
And an extra step to make sure the log folder exists before job submission:

```python
sbatch_options = slurm_utils.set_slurm_logs(sbatch_options, LOG_FOLDER)
```
Finally, I implemented the `ensure_dirs_exist` and `set_slurm_logs` functions in the `slurm_utils.py` module:
```python
# needs: import os
# needs: from os.path import dirname
# `logger` is assumed to be the module-level logger already used in slurm_utils.py
def ensure_dirs_exist(path):
    """Ensure the output folder for Slurm log files exists."""
    di = dirname(path)
    if di == "":
        return
    if not os.path.exists(di):
        os.makedirs(di, exist_ok=True)


def set_slurm_logs(sbatch_options, log_folder):
    for o in ("output", "error"):
        if o in sbatch_options:
            if log_folder != "":
                sbatch_options[o] = os.path.join(log_folder, sbatch_options[o])
        else:
            sbatch_options[o] = os.path.join(log_folder, "{rule}-%j.log")
        logger.debug(f"Setting Slurm {o} stream to: {sbatch_options[o]}")
        ensure_dirs_exist(sbatch_options[o])
    return sbatch_options
```
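To illustrate the intended behavior (the paths and rule name here are made up):

```python
# Made-up example: an existing "output" entry gets prefixed with the dated
# folder, while a missing "error" entry falls back to the "{rule}-%j.log" pattern.
opts = set_slurm_logs({"output": "myrule-%j.out"}, "results/log/slurm/2022-03-23")
# opts["output"] -> "results/log/slurm/2022-03-23/myrule-%j.out"
# opts["error"]  -> "results/log/slurm/2022-03-23/{rule}-%j.log"
```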
Hi there,

I need a bit of help understanding the new cluster resource allocation. The `slurm-submit.py` script tries to load the cluster configuration file into the `cluster_config` dictionary. However, `cluster_config.json` and `cluster_config.yaml` have been deprecated, and the recommended resource allocation method has been centralized into the profile-specific `config.yaml` file. Reading the Snakemake docs, I understand that resource reservations are passed to Snakemake via `--default-resources`, `--threads`, and/or the rule-specific directives (`threads` and `resources`). In the profile `config.yaml` file, if I ask for a specific resource I see it getting passed to the rule (for instance, if I specify the cluster partition for a given rule). I am still not sure Slurm is picking it up, but at least Snakemake is doing what it is supposed to do. In addition, inspecting the `cluster_config` dict, it looks like these reservations are never read by the profile code. In other words, `slurm-submit.py` never reads `config.yaml`. Is this the desired behavior?

In this scenario, I would like to ask how to manage log files. I used to maintain my own version (fork) of this profile with an extra setting for the log folder tree (organized by date and job). Upon execution of the `slurm-submit.py` script, the `cluster_config.json` or `cluster_config.yaml` was read, and the default settings were extracted and used to build the full log file paths, e.g. `/var/log/slurm/23-03-2022/some-rule-1234.log`, where `/var/log/slurm` was specified in the profile's `settings.json` file, `23-03-2022` was dynamically computed, and `some-rule-1234.log` was derived from `cluster-config.json` (under `__default__`). I guess the main question is: can I still get this behavior?
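For context, this is roughly how I understand resources are meant to be declared now; a minimal sketch of a profile `config.yaml` with placeholder values (not taken from my setup or the real profile):

```yaml
# Placeholder values only. These resources are consumed by Snakemake itself
# (equivalent to --default-resources on the command line); slurm-submit.py
# does not read this file back.
cluster: "slurm-submit.py"
jobs: 100
default-resources:
  - partition=normal
  - runtime=60
  - mem_mb=4000
```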