Closed schrej closed 2 years ago
Hi, @schrej Thank you for reporting this issue.
Is this problem occurring to lead to running out of AddressPool frequently in your environment? This hasn't been a case in our environment, because those allocated blocked will be used by Pods that use the same AddressPool created later if this issue occurs.
It lead to this problem at least once, and due to the combination with Kyverno (which relied on the networking) it completely bricked the cluster.
Describe the bug When updating the status of a BlockRequest fails (in our case due to a flaky API server), this can lead to multiple Blocks being created from a single request. In our case this lead to completely filling up all available Pools.
Environments
To Reproduce Tricky, since it's on the error path.
Expected behavior A single BlockRequest should always only yield a single Block.
Suggestion In general, status shouldn't be used as a basis for reconciliation decisions, unless it was computed in the current iteration. Controllers should always look at the actual state instead. I suggest to either derive the Block name from the request in a deterministic way, or add a label to the Block with the UID (or name) of the BlockRequest it was created from. This way the controller can simply query the API for the block it would create, and abort if it already exists. This way it no longer relies on a correct and up to date status of the BlockRequest and is more resilient to errors.