Closed jonapgar-groupby closed 1 month ago
I think this is a duplicate of #2147?
@jonapgar-groupby thank you for reporting the issue! This is a duplicate of #2147. Track the progress in #2147.
oh weird I didn't see that one after looking into this issue for a few hours! good to know it's being tracked :) thanks!
On Fri, May 17, 2024, 1:52 p.m. Kai-Hsun Chen wrote:
@jonapgar-groupby thank you for reporting the issue! This is a duplicate of #2147 (https://github.com/ray-project/kuberay/issues/2147). Track the progress in #2147.
Search before asking
KubeRay Component
ray-operator
What happened + What you expected to happen
If you specify an app.kubernetes.io/name label (with a value other than "kuberay") in your headGroupSpec.template.metadata.labels for a RayJob, the ray-operator will not be able to find the head service, and the cluster will never have its status updated.
If you see "unable to find head service" errors in your logs, followed by a loop of "Wait for the RayCluster.Status.State to be ready before submitting the job" messages, you may be hitting the same problem.
The error occurs because any custom app.kubernetes.io/name label is also added to the service, but when ray-operator attempts to locate the service, it uses a filter that always looks for app.kubernetes.io/name: kuberay.
Reproduction script
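The mismatch described above can be illustrated with a small, self-contained Python simulation. This is only a sketch of the label-versus-selector logic as I understand it from the behaviour I observed, not KubeRay's actual code: the helper names (build_service_labels, find_head_service), the service names, and the "my-app" label value are all made up for illustration.

```python
# Illustrative simulation only; none of these helpers exist in KubeRay.
NAME_KEY = "app.kubernetes.io/name"

# Assumption (from the report): the operator's lookup always filters on this value.
OPERATOR_SELECTOR = {NAME_KEY: "kuberay"}


def build_service_labels(head_template_labels):
    """Hypothetical: start from the default label, then let user labels override it."""
    labels = {NAME_KEY: "kuberay"}
    labels.update(head_template_labels)  # a custom name label replaces the default
    return labels


def find_head_service(services, selector):
    """Return the first service whose labels satisfy every selector entry, else None."""
    for svc in services:
        if all(svc["labels"].get(k) == v for k, v in selector.items()):
            return svc
    return None


# Head pod template without a custom name label: the lookup succeeds.
default_svc = {"name": "head-svc-default", "labels": build_service_labels({})}
print(find_head_service([default_svc], OPERATOR_SELECTOR))

# Head pod template with a custom name label: the lookup finds nothing,
# which corresponds to the "unable to find head service" error.
custom_svc = {"name": "head-svc-custom", "labels": build_service_labels({NAME_KEY: "my-app"})}
print(find_head_service([custom_svc], OPERATOR_SELECTOR))  # None
```

In this sketch the service inherits the custom label while the selector stays fixed, so the second lookup can never match. Either not overriding the label on the service or deriving the selector from the labels actually applied would avoid the mismatch.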
Anything else
No response
Are you willing to submit a PR?