GoogleCloudPlatform / kubeflow-distribution

Blueprints for Deploying Kubeflow on Google Cloud Platform and Anthos
Apache License 2.0

Mpijobs CustomResourceDefinition Version Error #424

Closed oguzhanoyan closed 1 year ago

oguzhanoyan commented 1 year ago

We had just installed the required features listed in the Kubeflow config.yaml. We then decided to install another feature, training-operator, but I got this error while installing mpijobs:

The CustomResourceDefinition "mpijobs.kubeflow.org" is invalid: status.storedVersions[0]: Invalid value: "v2beta1": must appear in spec.versions

I also got errors in the mpi-operator manager pod because the CRD installation had not completed successfully. I then went through the file that produced the error (apiextensions.k8s.io_v1_customresourcedefinition_mpijobs.kubeflow.org.yaml), changed .spec.versions.name from v1 to v2beta1, and applied it manually (kubectl apply -f apiextensions.k8s.io_v1_customresourcedefinition_mpijobs.kubeflow.org.yaml). This change fixed the errors in both the installation and the manager pod.
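
The error message points at a consistency rule: every entry in status.storedVersions must also be listed under spec.versions. Below is a minimal Python sketch of that rule, assuming the cluster already recorded v2beta1 as a stored version while the manifest declared only v1. The function name and the trimmed-down CRD dictionaries are hypothetical and only illustrate the check; they are not taken from the actual manifest.

```python
from typing import Any


def missing_stored_versions(crd: dict[str, Any]) -> list[str]:
    """Return the storedVersions entries that are absent from spec.versions."""
    # Names of the versions the CRD manifest declares.
    declared = {v["name"] for v in crd.get("spec", {}).get("versions", [])}
    # Versions the API server has recorded as persisted.
    stored = crd.get("status", {}).get("storedVersions", [])
    return [v for v in stored if v not in declared]


# Hypothetical CRD state before the manual edit: the manifest declares v1 only.
crd_before_fix = {
    "metadata": {"name": "mpijobs.kubeflow.org"},
    "spec": {"versions": [{"name": "v1"}]},
    "status": {"storedVersions": ["v2beta1"]},
}

# Hypothetical CRD state after changing the declared version to v2beta1.
crd_after_fix = {
    "metadata": {"name": "mpijobs.kubeflow.org"},
    "spec": {"versions": [{"name": "v2beta1"}]},
    "status": {"storedVersions": ["v2beta1"]},
}

print(missing_stored_versions(crd_before_fix))  # ['v2beta1']
print(missing_stored_versions(crd_after_fix))   # []
```

On a live cluster, the same fields could be read from the output of kubectl get crd mpijobs.kubeflow.org -o json and passed to the function as a parsed dictionary.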

chensun commented 1 year ago

Hi @oguzhanoyan, if I'm not mistaken, Mpijobs is from https://github.com/kubeflow/training-operator. Can you open an issue in that repo instead? Thanks!