snakemake / snakemake-executor-plugin-slurm

A Snakemake executor plugin for submitting jobs to a SLURM cluster
MIT License
18 stars 19 forks source link

fix: logfile quoting and scancel error handling #140

Closed johanneskoester closed 2 months ago

johanneskoester commented 2 months ago

Summary by CodeRabbit

coderabbitai[bot] commented 2 months ago

Walkthrough

The changes involve modifications to the run_job and cancel_jobs methods in the SLURM executor plugin. The run_job method now encloses the slurm_logfile variable in single quotes to handle file paths correctly. Additionally, the cancel_jobs method includes new exception handling for subprocess.CalledProcessError, improving error reporting during job cancellation.

Changes

Files Change Summary
snakemake_executor_plugin_slurm/__init__.py - run_job: Enclosed slurm_logfile in single quotes.
- cancel_jobs: Added exception handling for subprocess.CalledProcessError.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant SLURMExecutor
    participant Subprocess

    User->>SLURMExecutor: run_job()
    SLURMExecutor->>Subprocess: Execute SLURM command with 'slurm_logfile'
    Subprocess-->>SLURMExecutor: Return success or error
    SLURMExecutor-->>User: Return job status

    User->>SLURMExecutor: cancel_jobs()
    SLURMExecutor->>Subprocess: Execute scancel command
    alt Error Occurred
        Subprocess-->>SLURMExecutor: Raise CalledProcessError
        SLURMExecutor-->>User: Return error message with exit code
    else Success
        Subprocess-->>SLURMExecutor: Return success
        SLURMExecutor-->>User: Confirm cancellation
    end

Poem

πŸ‡ In the land of code, where jobs do run,
A tweak to the logs brings joy and fun.
Errors now caught, with messages clear,
SLURM's dance grows smoother, let’s all cheer!
Hops of success, let the workflows flow,
With every change, our spirits grow! πŸŽ‰


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share - [X](https://twitter.com/intent/tweet?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A&url=https%3A//coderabbit.ai) - [Mastodon](https://mastodon.social/share?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A%20https%3A%2F%2Fcoderabbit.ai) - [Reddit](https://www.reddit.com/submit?title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&text=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code.%20Check%20it%20out%3A%20https%3A//coderabbit.ai) - [LinkedIn](https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fcoderabbit.ai&mini=true&title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&summary=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code)
Tips ### Chat There are 3 ways to chat with [CodeRabbit](https://coderabbit.ai): - Review comments: Directly reply to a review comment made by CodeRabbit. Example: - `I pushed a fix in commit .` - `Generate unit testing code for this file.` - `Open a follow-up GitHub issue for this discussion.` - Files and specific lines of code (under the "Files changed" tab): Tag `@coderabbitai` in a new review comment at the desired location with your query. Examples: - `@coderabbitai generate unit testing code for this file.` - `@coderabbitai modularize this function.` - PR comments: Tag `@coderabbitai` in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples: - `@coderabbitai generate interesting stats about this repository and render them as a table.` - `@coderabbitai show all the console.log statements in this repository.` - `@coderabbitai read src/utils.ts and generate unit testing code.` - `@coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.` - `@coderabbitai help me debug CodeRabbit configuration file.` Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. ### CodeRabbit Commands (Invoked using PR comments) - `@coderabbitai pause` to pause the reviews on a PR. - `@coderabbitai resume` to resume the paused reviews. - `@coderabbitai review` to trigger an incremental review. This is useful when automatic reviews are disabled for the repository. - `@coderabbitai full review` to do a full review from scratch and review all the files again. - `@coderabbitai summary` to regenerate the summary of the PR. - `@coderabbitai resolve` resolve all the CodeRabbit review comments. - `@coderabbitai configuration` to show the current CodeRabbit configuration for the repository. - `@coderabbitai help` to get help. ### Other keywords and placeholders - Add `@coderabbitai ignore` anywhere in the PR description to prevent this PR from being reviewed. - Add `@coderabbitai summary` to generate the high-level summary at a specific location in the PR description. - Add `@coderabbitai` anywhere in the PR title to generate the title automatically. ### Documentation and Community - Visit our [Documentation](https://coderabbit.ai/docs) for detailed information on how to use CodeRabbit. - Join our [Discord Community](https://discord.com/invite/GsXnASn26c) to get help, request features, and share feedback. - Follow us on [X/Twitter](https://twitter.com/coderabbitai) for updates and announcements.
cmeesters commented 2 months ago

will take a look in the late afternoon - right now, I have meeting after meeting.

cmeesters commented 2 months ago

@johanneskoester I tried to read the source code. Didn't help. Consider this:

$ scancel 
scancel: error: No job identification provided
$ echo $?
1
$ sacct -j 16161523 -o state -X
     State 
---------- 
 COMPLETED 
$ scancel  16161523
$  echo $?
0

Which, according to your list, ought to be 2 for the last line. Also, your list states, that exit codes 8 and 0 are identical!

Some of the listed codes do not make sense at all: scancel is there to cancel (obviously) or signal jobs (steps). A job return code cannot indicate that it was requeued and scancel only gives its own exit codes (see main function after line 106) and distinguishes internally between its exit code and job codes. So, if anything, these codes refer to job exit codes, which (except for general ones) are software specific).

BTW I like your PR, might have a look into #136 (the feature works for me, but the tests fail, because an apparent NONE type?).

cmeesters commented 2 months ago

PS black is ok with the length of line 489, the CI is not. That is an issue in itself, don't you think?

after the post about the signals: If we erase considering the error code (because of the questionable purpose), we can delete it and the line shortens.

johanneskoester commented 2 months ago

PS black is ok with the length of line 489, the CI is not. That is an issue in itself, don't you think?

after the post about the signals: If we erase considering the error code (because of the questionable purpose), we can delete it and the line shortens.

Thanks for checking this (I really only stupidly pasted the AI output on the error codes, this again shows how useless this can be ATM). Still, since I observed the error code 8, let us keep it in the exception message for now, maybe it is useful for some people.

cmeesters commented 2 months ago

Gnarf, now the tests fail, because of missing test files. Wonderful. Not something, I will check on a Sunday morning, though. Tomorrow, I will have lectures till noon, only them I might have time to investigate.

johanneskoester commented 2 months ago

Should be fixed now.