ray-project / kuberay

A toolkit to run Ray applications on Kubernetes
Apache License 2.0

[Feature] Allow setting `ttlSecondsAfterFinished` for RayJob submitter Job #2187

Closed mickvangelderen closed 2 weeks ago

mickvangelderen commented 2 weeks ago

Description

The RayJob cluster can be cleaned up by setting `spec.shutdownAfterJobFinishes: true`, but I cannot figure out how to clean up the RayJob itself, including the submitter k8s Job.

I am not familiar with Go, but from browsing the source code a bit, it seems like this could be implemented in `createNewK8sJob`. The `backoffLimit` can already be set through `submitterConfig`; perhaps a `ttlSecondsAfterFinished` field could be added to that type and used in `createNewK8sJob`? That would not delete the RayJob resource, though.
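To make the suggestion concrete, here is a minimal sketch of how an optional TTL could be copied from the submitter config onto the Job spec. It is not KubeRay's actual code: `SubmitterConfig` and `JobSpec` below are local stand-ins for the real types, `buildSubmitterJobSpec` and `int32Ptr` are hypothetical helpers, and the `TTLSecondsAfterFinished` field on the config is the assumed addition proposed here.

```go
package main

import "fmt"

// SubmitterConfig is a local stand-in for the submitter configuration.
// TTLSecondsAfterFinished is the hypothetical field proposed in this issue.
type SubmitterConfig struct {
	BackoffLimit            *int32
	TTLSecondsAfterFinished *int32
}

// JobSpec is a local stand-in holding only the two Job spec fields used here.
type JobSpec struct {
	BackoffLimit            *int32
	TTLSecondsAfterFinished *int32
}

// buildSubmitterJobSpec copies the optional settings onto the Job spec.
// A field that is not set in the config stays nil in the result.
func buildSubmitterJobSpec(cfg *SubmitterConfig) JobSpec {
	spec := JobSpec{}
	if cfg == nil {
		return spec
	}
	if cfg.BackoffLimit != nil {
		spec.BackoffLimit = cfg.BackoffLimit
	}
	if cfg.TTLSecondsAfterFinished != nil {
		spec.TTLSecondsAfterFinished = cfg.TTLSecondsAfterFinished
	}
	return spec
}

// int32Ptr returns a pointer to v, for filling optional fields.
func int32Ptr(v int32) *int32 { return &v }

func main() {
	cfg := &SubmitterConfig{TTLSecondsAfterFinished: int32Ptr(300)}
	spec := buildSubmitterJobSpec(cfg)
	fmt.Println("ttlSecondsAfterFinished:", *spec.TTLSecondsAfterFinished)
}
```

As noted above, this would only cover the submitter k8s Job; the RayJob resource itself would still need separate cleanup.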

Use case

I want the RayJob cluster, the RayJob and the submitter k8s Job to be cleaned up automatically.

andrewsykim commented 2 weeks ago

FYI, there's already a PR for this: https://github.com/ray-project/kuberay/pull/2097

mickvangelderen commented 2 weeks ago

Dang, my bad. Thanks. Closing as duplicate of https://github.com/ray-project/kuberay/issues/1944.