AI-Hypercomputer / xpk

xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
Apache License 2.0

Fix duplicate definition of JOBSET_NAME #264

Closed frgossen closed 2 weeks ago

frgossen commented 2 weeks ago

JOBSET_NAME is defined twice in the container environment. With newer versions of kueue, the duplicate definition causes a failure like this:

The JobSet "xyz" is invalid: spec.replicatedJobs[0].template.spec.template.spec.containers[1].env[10]: Duplicate value: map[string]interface {}{"name":"JOBSET_NAME"}
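
For illustration only, here is a minimal Python sketch of how such a duplicate could be caught before the JobSet is submitted. The helper names `find_duplicate_env_names` and `dedupe_env` and the sample env list are hypothetical and are not taken from xpk; the sketch assumes the container env is available as a list of dicts with a `name` key, as in the Kubernetes container spec.

```python
from collections import Counter


def find_duplicate_env_names(env):
    """Return env var names that appear more than once, in first-seen order."""
    counts = Counter(entry["name"] for entry in env)
    return [name for name, n in counts.items() if n > 1]


def dedupe_env(env):
    """Keep the first definition of each name and drop later repeats."""
    seen = set()
    result = []
    for entry in env:
        if entry["name"] not in seen:
            seen.add(entry["name"])
            result.append(entry)
    return result


# Hypothetical env list (values are made up), with JOBSET_NAME defined twice.
env = [
    {"name": "JOBSET_NAME", "value": "xyz"},
    {"name": "REPLICATED_JOB_NAME", "value": "slice-job"},
    {"name": "JOBSET_NAME", "value": "xyz"},
]

duplicates = find_duplicate_env_names(env)
cleaned = dedupe_env(env)
```

Keeping the first definition is an arbitrary choice made for this sketch; removing the second definition at its source is the cleaner fix.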

Fixes / Features

-

Testing / Documentation

Testing details.