Open GuillaumeDesforges opened 1 year ago
@GuillaumeDesforges How do you run those jobs? Do you use the Jobs API to submit them to the cluster, or some other means (running a script on a node directly, or via Ray Client)?
I use JobSubmissionClient in a script from my local machine to submit to a remote Ray cluster.
Got it. Thanks! One question: there is a stop method for stopping a job programmatically in the Jobs API. Why do you want an endpoint in the dashboard API for killing a job?
In terms of "a button in the "Jobs" tab in the dashboard UI to kill a job", we've heard similar requests before. I've added it to our backlog.
Thanks, indeed I missed .stop_job.
However, a stop-job endpoint exposed via the web API could be helpful for interoperability with other tools (e.g. a command that pipes to xargs curl).
cc: @alanwguo @rkooo567 @rickyyx @architkulkarni for awareness
Hi, I'm a bot from the Ray team :)
To help human contributors to focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.
If there is no further activity in the next 14 days, the issue will be closed!
You can always ask for help on our discussion forum or Ray's public slack channel.
Keep it open. It's still in the backlog.
It would be really helpful to have this feature!
Is there a timeline to deliver this feature?
No timeline yet. The team is pretty overloaded at the moment. Contribution is welcome.
Any timelines on this yet?
Similarly, is it expected that the UI should have a STOP button for individual tasks as well?
We plan to first add job stop to the dashboard.
Description
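To my knowledge, there is no way to interrupt a job, neither from the dashboard REST API nor from the dashboard UI.
It would be helpful to have an endpoint in the dashboard REST API to stop a job, and a button in the "Jobs" tab in the dashboard UI to kill a job.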
Use case
A long-running job has been updated and needs to be restarted, so I stop the running job and re-submit with newer code/config.
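For reference, the stop-and-resubmit use case can already be approximated from a script with the Python SDK mentioned in this thread. The following is a minimal sketch, not an official recipe: it assumes a client object exposing stop_job, get_job_status, and submit_job the way JobSubmissionClient does. The helper name stop_and_resubmit, the timeout and polling values, the dashboard address, and the job ID are all hypothetical.

```python
import time

# Terminal job states as reported by the Jobs API (assumed string values).
TERMINAL_STATES = {"STOPPED", "SUCCEEDED", "FAILED"}


def _status_name(status):
    """Return the plain string for a status (enum member or str)."""
    return getattr(status, "value", status)


def stop_and_resubmit(client, job_id, entrypoint, runtime_env=None,
                      timeout_s=60.0, poll_s=1.0):
    """Stop `job_id`, wait until it is terminal, then submit a new job.

    `client` is anything with stop_job / get_job_status / submit_job,
    e.g. ray.job_submission.JobSubmissionClient. Hypothetical helper.
    """
    client.stop_job(job_id)  # request the stop; stopping is asynchronous
    deadline = time.monotonic() + timeout_s
    while _status_name(client.get_job_status(job_id)) not in TERMINAL_STATES:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not stop in {timeout_s}s")
        time.sleep(poll_s)
    # Resubmit with the updated code/config; returns the new job's ID.
    return client.submit_job(entrypoint=entrypoint, runtime_env=runtime_env)


def example_usage():
    """Illustration only; needs a reachable cluster and ray installed."""
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient("http://127.0.0.1:8265")  # assumed address
    # "raysubmit_XXXX" is a placeholder, not a real job ID.
    return stop_and_resubmit(client, "raysubmit_XXXX", "python train.py")
```

Passing the client in as a parameter keeps the helper independent of how the connection is made, so it can be exercised against a stub without a running cluster.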