Open mickvangelderen opened 2 days ago
ray-operator
created a RayJob with a submitterPodTemplate but no restartPolicy
had to search the logs of the ray-operator to find:
{"level":"error","ts":"2024-06-28T18:09:14.679Z","logger":"controllers.RayJob","msg":"failed to create k8s Job","RayJob":{"name":"mick-gxccf","namespace":"launch"},"reconcileID":"3b03831c-d14d-497f-9c8c-4ac790e1ff35","error":"Job.batch \"mick-gxccf\" is invalid: spec.template.spec.restartPolicy: Required value: valid values: \"OnFailure\", \"Never\"","stacktrace":"github.com/ray-project/kuberay/ray-operator/controllers/ray.(*RayJobReconciler).createNewK8sJob\n\t/home/runner/work/kuberay/kuberay/ray-operator/controllers/ray/rayjob_controller.go:440\ngithub.com/ray-project/kuberay/ray-operator/controllers/ray.(*RayJobReconciler).createK8sJobIfNeed\n\t/home/runner/work/kuberay/kuberay/ray-operator/controllers/ray/rayjob_controller.go:350\ngithub.com/ray-project/kuberay/ray-operator/controllers/ray.(*RayJobReconciler).Reconcile\n\t/home/runner/work/kuberay/kuberay/ray-operator/controllers/ray/rayjob_controller.go:168\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}
I thought the RayJob spec is supposed to be validated on submission to the API? Is the validation not the same?
"submitterPodTemplate": {
  "spec": {
    // "restartPolicy": "Never", <- OFFENDER
    // ... as usual
  }
}
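Until the operator surfaces this error more directly, a client-side check along these lines could catch the missing field before the RayJob is submitted. This is only a minimal sketch: `check_submitter_restart_policy` is a hypothetical helper (not part of KubeRay), it assumes the manifest has already been loaded into a plain dict, and the allowed values are copied from the Job validation error in the log above.

```python
from __future__ import annotations

# Allowed values as reported by the Job validation error in the operator log.
ALLOWED_RESTART_POLICIES = {"OnFailure", "Never"}


def check_submitter_restart_policy(rayjob: dict) -> list[str]:
    """Hypothetical pre-submit check: return problems found in
    spec.submitterPodTemplate (an empty list means nothing was flagged)."""
    template = rayjob.get("spec", {}).get("submitterPodTemplate")
    if template is None:
        # No custom submitter template in the manifest, so this check
        # has nothing to flag.
        return []

    policy = template.get("spec", {}).get("restartPolicy")
    if policy is None:
        return ["spec.submitterPodTemplate.spec.restartPolicy: Required value"]
    if policy not in ALLOWED_RESTART_POLICIES:
        return [
            "spec.submitterPodTemplate.spec.restartPolicy: "
            f"unsupported value {policy!r}"
        ]
    return []


if __name__ == "__main__":
    # Mirrors the reproduction above: a template without restartPolicy.
    broken = {"spec": {"submitterPodTemplate": {"spec": {}}}}
    for problem in check_submitter_restart_policy(broken):
        print(problem)
```

Running the check against the manifest dict before calling the Kubernetes API reports the problem immediately, rather than leaving it to be found in the ray-operator logs.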
Search before asking
KubeRay Component
ray-operator
What happened + What you expected to happen
created a RayJob with a submitterPodTemplate but no restartPolicy
had to search the logs of the ray-operator to find:
I thought the RayJob spec is supposed to be validated on submission to the API? Is the validation not the same?
Reproduction script
Anything else
No response
Are you willing to submit a PR?