Closed cmeesters closed 3 months ago
The updates include the addition of a utility function, `delete_slurm_environment`, to manage SLURM-related environment variables, enhancing resource management. The `warn_on_jobcontext` method in the `ExecutorSettings` class has been modified to call this utility function when a warning about running Snakemake in a SLURM context is logged. This aims to ensure a cleaner execution environment for jobs.
| Files | Change Summary |
|---|---|
| `snakemake_executor_plugin_slurm/__init__.py`, `snakemake_executor_plugin_slurm/utils.py` | Added the `delete_slurm_environment` function to unset SLURM-related environment variables and modified `warn_on_jobcontext` to call this function when logging a warning. |
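A minimal sketch of what such a utility could look like (a hypothetical illustration, not the code from `snakemake_executor_plugin_slurm/utils.py`):

```python
import os


def delete_slurm_environment():
    """Remove all SLURM_* variables from the current process environment.

    This only affects the current process and any children it spawns
    later; the parent shell's environment is untouched.
    """
    # Copy the keys first, since we mutate os.environ while iterating.
    for var in list(os.environ):
        if var.startswith("SLURM_"):
            del os.environ[var]
```

Deleting from `os.environ` also calls `os.unsetenv` under the hood, so child processes started afterwards inherit the cleaned environment.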
| Objective | Addressed | Explanation |
|---|---|---|
| Properly handle SLURM job context warnings (#113) | ✅ | |
| Ensure environment cleanliness for SLURM jobs (#113) | ✅ | |
| Improve Snakemake compatibility when run under SLURM (#113) | ❓ | The changes address warnings but may not fully resolve compatibility issues. |
@fgvieira would you have time to review this PR? Alas, we cannot test this within the CI (due to lack of resources). Seems to work fine for me and others.
Why is it not recommended to run `snakemake` as a SLURM job? That is how I've been running it.
Thing is: the executor exports the environment. This has to be done because otherwise the base environment might not be present. Within a SLURM job, the `SLURM_*` variables get exported, too. This caused SLURM in some cases to complain about `mem` and `mem_per_cpu` mismatches.
I am not sure the current PR will fix all issues. I am not even sure that 100 % stability can be achieved at all - apparently, HPC admins have made awkward configuration a hobby ... But removing the scheduler env variables certainly is a step forward.
So the idea would be to remove all `SLURM_*` env variables from the newly launched job, not from the parent `snakemake` job, right?
In fact, you cannot remove environment variables from a parent shell. Hence, Python can only remove them within its own environment on a node, not within the node's job script. Within the Snakemake process on that node, you would not see the `SLURM_*` variables any more. Likewise, a daughter process of that same Python process would not see them. A newly launched job (on the same node or a different one) would get a fresh population of `SLURM_*` env variables from SLURM itself.
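This inheritance behaviour can be demonstrated with a short, self-contained sketch (not code from the PR): a variable deleted from `os.environ` disappears for the process itself and for any child it spawns afterwards, while the parent shell keeps its copy.

```python
import os
import subprocess
import sys

# Simulate a SLURM-populated environment in this process.
os.environ["SLURM_JOB_ID"] = "376803"

# Remove it: this process, and any child spawned from now on,
# will no longer see the variable.
del os.environ["SLURM_JOB_ID"]

# A child Python process inherits the cleaned environment.
child = subprocess.run(
    [sys.executable, "-c", "import os; print('SLURM_JOB_ID' in os.environ)"],
    capture_output=True,
    text=True,
)
print(child.stdout.strip())  # prints: False
```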
If you use (host-based) logins to that node, you will not see these environment variables either (because: new shell). However, if you run
$ srun -A <account> -p <partition> --pty -t 10 bash -i
<node>:<path>$ python3
>>> import os
>>> print(os.environ["SLURM_JOB_ID"])
376803
>>> del os.environ["SLURM_JOB_ID"]
>>> print(os.environ["SLURM_JOB_ID"])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<frozen os>", line 714, in __getitem__
KeyError: 'SLURM_JOB_ID'
>>>
<node>:<path>$ echo $SLURM_JOB_ID
376803
you will see that you are able to tinker with env vars within your own process, but not in the parent shell.
might increase stability of in-job submissions, might fix #113