Closed mimowo closed 3 days ago
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mimowo
Name | Link |
---|---|
Latest commit | 52a05e33dfffbb8e88482dab148c4c02ce2da5b4 |
Latest deploy log | https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/6740312146780900085f78f6 |
Deploy Preview | https://deploy-preview-3612--kubernetes-sigs-kueue.netlify.app |
cc @mwysokin - thank you for testing and nailing down the tricky scenario! cc @PBundyra @mbobrovskyi
/lgtm Thanks!
LGTM label has been added.
/cherry-pick release-0.9
@mbobrovskyi: once the present PR merges, I will cherry-pick it on top of release-0.9
in a new PR and assign it to you.
@mbobrovskyi: new pull request created: #3613
What type of PR is this?
/kind bug
What this PR does / why we need it:
To fix the bug where Kueue drops reconcile requests for the non-leading replica, due to the handling here. The wrong type was passed for the reconciled object (ClusterQueue instead of ResourceFlavor), so the lookup returned NotFound and the request was ignored here.
As a consequence, after a rolling restart, the previously non-leading replica would not perform Reconcile when becoming the leading replica (the events were lost). Users could observe this as workloads stuck with the following message:
couldn't assign flavors to pod set main: Flavor "tas-flavor" information missing in TAS cache, Workload requires Topology, but there is no TAS cache information
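For readers unfamiliar with the failure mode, here is a minimal, self-contained sketch of how a wrong object type can cause a non-leading replica to drop a request. It is a toy model, not Kueue's code: `store`, `nonLeaderReconcile`, and `errNotFound` are hypothetical names, and the "requeue when the object exists" behavior is an assumption about how the non-leader path is meant to work.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound API error (hypothetical).
var errNotFound = errors.New("not found")

// store is a toy object cache keyed by kind, then by name (hypothetical).
type store map[string]map[string]bool

// get returns errNotFound unless an object of the given kind and name exists.
func (s store) get(kind, name string) error {
	if s[kind][name] {
		return nil
	}
	return errNotFound
}

// nonLeaderReconcile models, under the assumptions above, what a non-leading
// replica does with a request: it looks the object up using the kind it was
// configured with and drops the request on NotFound; otherwise it requeues
// the request so it can be retried once this replica becomes the leader.
func nonLeaderReconcile(s store, configuredKind, name string) string {
	if err := s.get(configuredKind, name); errors.Is(err, errNotFound) {
		return "dropped"
	}
	return "requeued"
}

func main() {
	s := store{"ResourceFlavor": {"tas-flavor": true}}

	// Wrong kind: the ResourceFlavor exists, but the lookup uses ClusterQueue.
	fmt.Println(nonLeaderReconcile(s, "ClusterQueue", "tas-flavor"))

	// Correct kind: the object is found and the request is kept.
	fmt.Println(nonLeaderReconcile(s, "ResourceFlavor", "tas-flavor"))
}
```

In this model the first call yields `dropped` and the second `requeued`, which mirrors the symptom described above: the event for an existing ResourceFlavor is lost before the replica ever becomes the leader.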
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?