ray-project / kuberay

A toolkit to run Ray applications on Kubernetes
Apache License 2.0
963 stars 328 forks

Remove failed reason if reconciliation was successful #2194

Closed Ygnas closed 1 week ago

Ygnas commented 2 weeks ago

Why are these changes needed?

At the moment, the reason is set only on error and is not removed even after the error is fixed.

This should clear the reason once the error no longer exists.

My issue was that another operator added the ServiceAccount to the RayCluster and created that ServiceAccount a moment later. If KubeRay reconciles in between, the reason about the forbidden ServiceAccount is added and never removed, even though the ServiceAccount was created almost at the same time and does exist.
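To illustrate the intended behavior, here is a minimal sketch in Go. It is not the actual KubeRay code: `ClusterStatus`, `updateReason`, and `errForbidden` are hypothetical names used only for this example, and the real status type has more fields. The idea is that a successful reconcile pass resets the reason instead of leaving the previous error in place.

```go
package main

import (
	"errors"
	"fmt"
)

// ClusterStatus is a hypothetical, simplified stand-in for the RayCluster status.
type ClusterStatus struct {
	Reason string
}

// errForbidden is a made-up error that mimics the transient ServiceAccount failure.
var errForbidden = errors.New("serviceaccount is forbidden")

// updateReason records the error from one reconcile pass, or clears a
// stale reason when the pass succeeded.
func updateReason(status *ClusterStatus, reconcileErr error) {
	if reconcileErr != nil {
		status.Reason = reconcileErr.Error()
		return
	}
	// Reconcile succeeded, so any reason left over from an earlier pass is stale.
	status.Reason = ""
}

func main() {
	status := &ClusterStatus{}

	// First pass: the ServiceAccount does not exist yet.
	updateReason(status, errForbidden)
	fmt.Printf("after failed pass: %q\n", status.Reason)

	// Second pass: the ServiceAccount now exists and reconcile succeeds.
	updateReason(status, nil)
	fmt.Printf("after successful pass: %q\n", status.Reason)
}
```

Without the reset in the success branch, the second pass would leave the "forbidden" message in the status indefinitely, which is the behavior described above.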

Related issue number

Checks

kevin85421 commented 1 week ago

We are currently working on redefining the RayCluster status, so I may not merge this PR at the moment. I will send you the WIP doc. We can work together if you are interested in it.