AI-Hypercomputer / xpk

xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
Apache License 2.0

Fix duplicate definition of JOBSET_NAME #255

Closed frgossen closed 2 weeks ago

frgossen commented 2 weeks ago

The duplicate definition of JOBSET_NAME causes issues with newer versions of Kueue, which fail with an error like this:

The JobSet "xyz" is invalid: spec.replicatedJobs[0].template.spec.template.spec.containers[1].env[10]: Duplicate value: map[string]interface {}{"name":"JOBSET_NAME"}
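To illustrate the kind of check that surfaces this problem before submission, here is a minimal sketch in Python. It assumes the container's env is available as a list of dicts with a "name" key, mirroring the Kubernetes container spec; the helper name `duplicate_env_names` and the sample values are hypothetical and not taken from xpk.

```python
from collections import Counter


def duplicate_env_names(env):
    """Return env var names that appear more than once, in first-seen order."""
    counts = Counter(entry["name"] for entry in env)
    return [name for name, n in counts.items() if n > 1]


# Hypothetical container env list (illustrative values, not from xpk).
env = [
    {"name": "JOBSET_NAME", "value": "xyz"},
    {"name": "REPLICATED_JOB_NAME", "value": "slice-job"},
    {"name": "JOBSET_NAME", "value": "xyz"},
]

print(duplicate_env_names(env))
```

A non-empty result means the spec defines at least one name twice, which is the condition the validation error above reports.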


pawloch00 commented 2 weeks ago

Please open this PR from a branch in the main repository, not from a fork.

frgossen commented 2 weeks ago

Closing this in favour of https://github.com/AI-Hypercomputer/xpk/pull/264