Closed tenzen-y closed 2 weeks ago
Totals | |
---|---|
Change from base Build 11663764609: | 0.0% |
Covered Lines: | 77 |
Relevant Lines: | 77 |
/hold for review
/assign @kubeflow/wg-training-leads
@andreyvelich I addressed all comments. PTAL, thanks!
Thanks @tenzen-y! /lgtm /approve /hold
Feel free to merge it.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: andreyvelich
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Thank you for the review! /hold cancel
What this PR does / why we need it: I implemented the TrainJob condition mechanism based on https://github.com/kubeflow/training-operator/tree/master/docs/proposals/2170-kubeflow-training-v2#state-transition
However, the current implementation depends on the JobSet status.conditions as opposed to the status.terminalState since the terminalState was introduced in JobSet v0.6, then the JobSet depends on the K8s lib. After we upgrade the training-operator dep version to 1.30 in https://github.com/kubeflow/training-operator/pull/2299, we can rely on the termonalState.
So, after we upgrade the K8s libs to 1.30, we can revisit the JobSet status.terminalState.
Which issue(s) this PR fixes (optional, in
Fixes #<issue number>, #<issue number>, ...
format, will close the issue(s) when PR gets merged): Part-of: https://github.com/kubeflow/training-operator/issues/2207 Relates to #2170Checklist: