kubeflow / training-operator

Distributed ML Training and Fine-Tuning on Kubernetes
https://www.kubeflow.org/docs/components/training
Apache License 2.0

KEP-2170: Implement validations for TrainJob #2209

Open andreyvelich opened 1 month ago

andreyvelich commented 1 month ago

Related: https://github.com/kubeflow/training-operator/issues/2170

We should create validations for the TrainJob using the following tools:

/area webhook

tenzen-y commented 4 weeks ago

/retitle KEP-2170: Implement validations for TrainJob

tenzen-y commented 2 weeks ago

Regarding CEL validations: we can refer to the following Kubernetes blog post, which shows how to enforce immutability with CEL. The same rules can be declared on our API types using kubebuilder markers.

https://kubernetes.io/blog/2022/09/29/enforce-immutability-using-cel/
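As a rough illustration of the marker approach, a minimal sketch follows. It assumes a hypothetical, trimmed-down `TrainJobSpec` with made-up `RuntimeRef` and `NumNodes` fields (not the actual TrainJob API), and uses the `+kubebuilder:validation:XValidation` marker so that controller-gen emits an `x-kubernetes-validations` rule into the CRD. The `validateUpdate` helper is also hypothetical; it only mirrors the `self == oldSelf` transition rule in plain Go, roughly what a webhook would check, so the behavior can be tried locally without an API server.

```go
package main

import (
	"errors"
	"fmt"
)

// TrainJobSpec is a hypothetical spec used only for illustration;
// the field names are assumptions, not the real TrainJob API.
type TrainJobSpec struct {
	// RuntimeRef names the training runtime. The marker below asks
	// controller-gen to generate a CEL rule that rejects any update
	// changing this field once it has been set.
	// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="runtimeRef is immutable"
	RuntimeRef string `json:"runtimeRef"`

	// NumNodes is the number of training nodes; at least one is required.
	// +kubebuilder:validation:Minimum=1
	NumNodes int32 `json:"numNodes,omitempty"`
}

// validateUpdate is a hypothetical helper that mirrors the CEL rule
// "self == oldSelf" for RuntimeRef in plain Go.
func validateUpdate(oldSpec, newSpec TrainJobSpec) error {
	if newSpec.RuntimeRef != oldSpec.RuntimeRef {
		return errors.New("runtimeRef is immutable")
	}
	return nil
}

func main() {
	oldSpec := TrainJobSpec{RuntimeRef: "torch-distributed", NumNodes: 2}

	// Changing a mutable field is allowed.
	scaled := TrainJobSpec{RuntimeRef: "torch-distributed", NumNodes: 4}
	fmt.Println("scale update error:", validateUpdate(oldSpec, scaled))

	// Changing the immutable field is rejected.
	swapped := TrainJobSpec{RuntimeRef: "other-runtime", NumNodes: 2}
	fmt.Println("runtime update error:", validateUpdate(oldSpec, swapped))
}
```

With markers like these the API server evaluates the rule on every update, so simple immutability checks would not need webhook code; checks that span multiple objects would likely still belong in the webhook.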

akshaychitneni commented 4 days ago

/assign