Open vadim0x60 opened 1 day ago
The changes involve modifications to the `Executor` class in the `snakemake_executor_plugin_slurm/__init__.py` file. The updates enhance job submission logic to support GPU resources by checking for `gpu` and `nvidia_gpu` keys and adjusting the submission command accordingly. Additionally, the logic for setting the number of tasks for SLURM jobs has been updated to ensure compliance with SLURM version 22.05, which requires the `--ntasks` option for all submissions. Error handling during job submission has also been improved for better clarity on failures.
| File | Change Summary |
|---|---|
| `snakemake_executor_plugin_slurm/__init__.py` | Enhanced job submission logic for GPU resources, updated task settings for SLURM compliance, and refined error handling. |
sequenceDiagram
participant Job as JobExecutorInterface
participant Executor as Executor
participant SLURM as SLURM System
Job->>Executor: Submit Job
Executor->>Executor: Check for GPU resources
alt GPU resources found
Executor->>SLURM: Submit with --gres=gpu:<count>
else No GPU resources
Executor->>SLURM: Submit without GPU
end
SLURM-->>Executor: Job ID
Executor-->>Job: Return Job ID or error
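The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the helper name `gpu_and_task_flags`, the plain-dict `resources` argument, and the `tasks` key with a default of 1 are all assumptions made for the example.

```python
def gpu_and_task_flags(resources: dict) -> list[str]:
    """Build sbatch flags for GPU and task settings (hypothetical helper)."""
    flags = []
    # The walkthrough says the patch checks both the `gpu` and `nvidia_gpu` keys.
    gpu_count = resources.get("gpu") or resources.get("nvidia_gpu")
    if gpu_count:
        # Relay the GPU request to SLURM as a generic resource.
        flags.append(f"--gres=gpu:{gpu_count}")
    # --ntasks is always emitted; the `tasks` key and default of 1 are assumed.
    flags.append(f"--ntasks={resources.get('tasks', 1)}")
    return flags


print(gpu_and_task_flags({"gpu": 2}))        # ['--gres=gpu:2', '--ntasks=1']
print(gpu_and_task_flags({"mem_mb": 4000}))  # ['--ntasks=1']
```

A job without either GPU key gets no `--gres` flag, which corresponds to the "Submit without GPU" branch above.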
🐇 "In the land of SLURM, where jobs take flight,
With GPUs added, they shine so bright.
Tasks now aligned, with options galore,
Error messages clearer, we can explore!
Hopping through changes, we celebrate cheer,
For the world of computing, we hold dear!" 🌟
Thank you for this PR. There is some misunderstanding:
- `--gres`, but I concede that this is by now almost an issue of the past.
- `slurm_extra`:

      resources:
          slurm_extra="'--gres=gpu:1'"

I will only approve this particular PR if it a) becomes generic (drops the NVIDIA specifics), b) supports a `--slurm-...` flag, and c) reflects the changes in the docs. As this is easy enough: shall I do a new PR, or will you refactor yours?
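To illustrate why the `slurm_extra` value carries an inner pair of quotes, here is a small sketch. It assumes that the executor appends the string verbatim to the `sbatch` command line. The helper `append_slurm_extra` is hypothetical, and the example uses the `--gres=gpu:1` form that `sbatch` accepts.

```python
import shlex


def append_slurm_extra(call: str, resources: dict) -> str:
    """Append a user-supplied slurm_extra string to the command (hypothetical)."""
    extra = resources.get("slurm_extra")
    return f"{call} {extra}" if extra else call


# The resource value itself still contains the inner single quotes.
call = append_slurm_extra("sbatch --ntasks=1", {"slurm_extra": "'--gres=gpu:1'"})
print(call)               # sbatch --ntasks=1 '--gres=gpu:1'
# A POSIX shell strips those quotes when it tokenizes the command line.
print(shlex.split(call))  # ['sbatch', '--ntasks=1', '--gres=gpu:1']
```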
Snakemake supports specification of required GPU resources in the Snakefile, i.e.
Before this patch, the SLURM executor ignored these specifications, and unless the user manually prevented it, the jobs would run on CPU nodes. This is relatively easy to fix because, like Snakemake, SLURM supports per-job GPU resource specification. This patch ensures that GPU requirements from the Snakefile are relayed to SLURM via
Summary by CodeRabbit
New Features
Bug Fixes