Open tenzen-y opened 1 month ago
/remove-label lifecycle/needs-triage
Could I try upgrading this?
I think I'd open a PR for 1.30 first, following your detailed plan.
Yes, we can start on v1.30 support before we decide on the range of supported versions.
/assign @kannon92
I may need some guidance on the code generation.
I tried upgrading this and ran into some problems with hack/update-codegen.sh.
# Notice: The code in code-generator does not generate defaulter by default.
# We need to build binary from vendor cmd folder.
#echo "Building defaulter-gen"
#go build -o defaulter-gen ${CODEGEN_PKG}/cmd/defaulter-gen
# $(go env GOPATH)/bin/defaulter-gen is automatically built from ${CODEGEN_PKG}/generate-groups.sh
echo "Generating defaulters for kubeflow.org/v1"
$(go env GOPATH)/bin/defaulter-gen --input-dirs github.com/kubeflow/training-operator/pkg/apis/kubeflow.org/v1 \
  -O zz_generated.defaults \
  --output-package github.com/kubeflow/training-operator/pkg/apis/kubeflow.org/v1 \
  --go-header-file hack/boilerplate/boilerplate.go.txt "$@" \
  --output-base "${TEMP_DIR}"
I built defaulter-gen, but most of these arguments no longer seem to exist in 0.30.
0.30 does not recognize --input-dirs, -O, --output-package, or --output-base.
@tenzen-y let's discuss on https://github.com/kubeflow/training-operator/pull/2299.
I found it difficult to keep the existing script working because much of it has changed with 0.30, so I created a new script that is very similar to the ones in Kueue/JobSet.
Code generation seems to work, but I am running into problems with the SDK generation.
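For readers following along, here is a rough sketch of what a Kueue/JobSet-style script could look like. It is not the script from the PR. It assumes that code-generator 0.30 ships `kube_codegen.sh` with a `kube::codegen::gen_helpers` function that takes `--boilerplate` and a positional input directory and generates defaulters for packages carrying the defaulter-gen tags. `REPO_ROOT`, `API_DIR`, `DRY_RUN`, and the `run` and `generate` helpers are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical layout; adjust these paths to the real repository.
REPO_ROOT="${REPO_ROOT:-$(pwd)}"
BOILERPLATE="${REPO_ROOT}/hack/boilerplate/boilerplate.go.txt"
API_DIR="${REPO_ROOT}/pkg/apis"

# DRY_RUN=1 (the default in this sketch) only prints the generator call.
# Set DRY_RUN=0 to actually run it; that needs Go, bash, and
# k8s.io/code-generator listed in go.mod.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

generate() {
  if [ "${DRY_RUN}" != "1" ]; then
    # Locate the code-generator module and load its helper functions.
    CODEGEN_PKG="$(go list -m -f '{{.Dir}}' k8s.io/code-generator)"
    . "${CODEGEN_PKG}/kube_codegen.sh"
  fi

  # Assumption: gen_helpers replaces the direct defaulter-gen call, so the
  # old --input-dirs / -O / --output-package / --output-base flags go away.
  run kube::codegen::gen_helpers \
    --boilerplate "${BOILERPLATE}" \
    "${API_DIR}"
}

generate
```

Running it as-is prints the call that would be made, which is a cheap way to check the paths before switching to `DRY_RUN=0`.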
You may be able to learn something from the mpi-operator: https://github.com/kubeflow/mpi-operator/pull/657
Yes. That is a great callout.
What would you like to be added?
I would like to support Kubernetes v1.29 - v1.31 and drop support for v1.27 and v1.28 before we release the final training-operator v1 version.
However, based on the v1.28 deprecation date, we may want to support four Kubernetes versions (v1.28 - v1.31). @kubeflow/wg-training-leads WDYT?
What we need to do:
Note that we should work through the tasks above one version at a time (1.29 -> 1.30 -> 1.31) so that we can easily revert a commit if we hit any version-specific bugs or regressions.
Why is this needed?
Currently, we support Kubernetes v1.27 - v1.29, but these versions have been or will soon be deprecated: https://kubernetes.io/releases/ So we should support newer versions.
Love this feature?
Give it a 👍 We prioritize the features with most 👍