randomvariable opened this issue 3 years ago
We probably should discuss in next office hours.
/assign @shivi28
@shivi28 is there any update on this bug?
Hey @sedefsavas, I am going to add debouncing logic and introduce a debouncing window in the AWSCluster and AWSMachine reconcilers. Will raise a PR this week.
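A minimal sketch of what such a debouncing window could look like in a controller-runtime reconciler; the `debouncer` type and the `window` and `shouldRequeue` names are illustrative, not from the CAPA codebase:

```go
// Hypothetical debouncing helper for a controller-runtime reconciler.
// All names here (debouncer, newDebouncer, shouldRequeue) are illustrative.
package controllers

import (
	"sync"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

type debouncer struct {
	mu       sync.Mutex
	lastSeen map[string]time.Time
	window   time.Duration
}

func newDebouncer(window time.Duration) *debouncer {
	return &debouncer{lastSeen: map[string]time.Time{}, window: window}
}

// shouldRequeue reports whether the object was reconciled within the
// debounce window; if so, it returns a Result that defers the reconcile
// instead of hitting the EC2 API again immediately.
func (d *debouncer) shouldRequeue(key string) (ctrl.Result, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if last, ok := d.lastSeen[key]; ok {
		if wait := d.window - time.Since(last); wait > 0 {
			return ctrl.Result{RequeueAfter: wait}, true
		}
	}
	d.lastSeen[key] = time.Now()
	return ctrl.Result{}, false
}
```

A reconciler could then call `shouldRequeue` at the top of `Reconcile` and return the deferred `Result` early, so bursts of AWSCluster updates do not translate into bursts of EC2 API calls.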
This issue is related to: https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/1764
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten

Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
/reopen
@richardcase: Reopened this issue.
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten

Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
/reopen
/triage accepted
/priority important-soon
@Ankitasw: Reopened this issue.
The issue has been marked as an important bug and triaged. Such issues are automatically marked as frozen when hitting the rotten state to avoid missing important bugs.
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle frozen
This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged.
Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Deprioritize it with /priority important-longterm or /priority backlog
- Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
From office hours 2023-04-03:
/triage accepted
/priority important-soon
This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged.
Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Deprioritize it with /priority important-longterm or /priority backlog
- Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
/kind bug
What steps did you take and what happened:
In a scenario where CAPA tries to reconcile security groups and fails, all machines are requeued, which then causes API rate limiting against the EC2 API.
What did you expect to happen:
Anything else you would like to add:
We deliberately re-enqueue all machines when an AWSCluster is updated in an unpaused state in: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/v0.6.4/controllers/awsmachine_controller.go#L239-L240
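In controller-runtime terms, the fan-out looks roughly like the sketch below. This is a simplified illustration written against a recent controller-runtime API rather than an exact copy of the linked v0.6.4 code; the reconciler type is abbreviated, the `infrav1` import path is an assumption, and the mapping function body is elided:

```go
// Simplified sketch of the AWSCluster -> AWSMachine fan-out watch described
// above; types are abbreviated and the import path is assumed, not copied
// from v0.6.4.
package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/handler"

	infrav1 "sigs.k8s.io/cluster-api-provider-aws/v2/api/v1beta2" // assumed path
)

type AWSMachineReconciler struct {
	client.Client
}

func (r *AWSMachineReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Reconcile body elided.
	return ctrl.Result{}, nil
}

func (r *AWSMachineReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&infrav1.AWSMachine{}).
		// Every update to an unpaused AWSCluster re-enqueues all of that
		// cluster's AWSMachines, so one failing security-group reconcile
		// fans out into N machine reconciles against the EC2 API.
		Watches(
			&infrav1.AWSCluster{},
			handler.EnqueueRequestsFromMapFunc(r.requeueAWSMachinesForUnpausedCluster),
		).
		Complete(r)
}

func (r *AWSMachineReconciler) requeueAWSMachinesForUnpausedCluster(ctx context.Context, o client.Object) []ctrl.Request {
	// Body elided: list the cluster's AWSMachines and return one
	// reconcile.Request per machine.
	return nil
}
```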
We wanted to do that so that we get fast reconciliation; however, it is definitely causing the API rate limit to be hit quickly. There's a simple reproduction too:
Create a cluster with CAPA, then change the description of one of the security group rules while keeping the rules otherwise the same. Or do the following:
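For illustration, a hypothetical way to change only a rule's description via the EC2 API, using the AWS SDK for Go (v1); the group ID, ports, and CIDR below are placeholders and must match an existing rule exactly:

```go
// Hypothetical reproduction helper: edits only the description of an
// existing ingress rule, leaving the rule set semantically identical.
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	sess := session.Must(session.NewSession())
	svc := ec2.New(sess)

	// Protocol, ports, and CIDR must match the existing rule exactly for
	// EC2 to locate it; only the Description field changes.
	_, err := svc.UpdateSecurityGroupRuleDescriptionsIngress(
		&ec2.UpdateSecurityGroupRuleDescriptionsIngressInput{
			GroupId: aws.String("sg-0123456789abcdef0"), // placeholder
			IpPermissions: []*ec2.IpPermission{{
				IpProtocol: aws.String("tcp"),
				FromPort:   aws.Int64(6443),
				ToPort:     aws.Int64(6443),
				IpRanges: []*ec2.IpRange{{
					CidrIp:      aws.String("0.0.0.0/0"),
					Description: aws.String("edited description"),
				}},
			}},
		})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("rule description updated")
}
```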
Environment:
- Kubernetes version: (use `kubectl version`):
- OS (e.g. from `/etc/os-release`):

/priority important-soon
/area networking