open-policy-agent / gatekeeper

🐊 Gatekeeper - Policy Controller for Kubernetes
https://open-policy-agent.github.io/gatekeeper/
Apache License 2.0

controller should have leader election option when not running webhooks #1794

Open jdef opened 2 years ago

jdef commented 2 years ago

Describe the solution you'd like: I'd like to deploy e.g. the "audit" component statically, not via k8s Deployments, in HA. It would be very convenient to leverage controller-runtime's leader-election capability for this. Enabling it via a CLI flag would be ideal.

Anything else you would like to add: The program should probably crash if leader election AND webhooks are both enabled, since webhooks should not depend on apiserver-backed leader election in an HA setting.
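
A minimal sketch of the requested wiring, assuming a hypothetical --enable-leader-election flag (not an existing Gatekeeper flag) that feeds controller-runtime's built-in leader election, plus the crash-on-conflict guard described above:

```go
package main

import (
	"flag"
	"os"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

// webhooksEnabled is a stand-in for however the binary decides it is
// serving webhooks (hypothetical helper, not Gatekeeper code).
func webhooksEnabled() bool { return false }

func main() {
	var enableLeaderElection bool
	var leaderElectionID string
	flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,
		"run singleton controllers only on the replica holding the lease")
	flag.StringVar(&leaderElectionID, "leader-election-id", "gatekeeper-audit-lock",
		"name of the lease used for leader election")
	flag.Parse()

	log := zap.New()
	ctrl.SetLogger(log)

	// Crash early on the unsupported combination described above: webhook
	// availability must not wait on a leadership lease.
	if enableLeaderElection && webhooksEnabled() {
		log.Error(nil, "leader election and webhooks are mutually exclusive")
		os.Exit(1)
	}

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:   enableLeaderElection,
		LeaderElectionID: leaderElectionID,
	})
	if err != nil {
		log.Error(err, "unable to create manager")
		os.Exit(1)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		log.Error(err, "manager exited with error")
		os.Exit(1)
	}
}
```

Since the flag would default to false, existing deployments would be unaffected unless they opt in.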


maxsmythe commented 2 years ago

@shomron I think we discussed this at some point. Do you know if that discussion is documented anywhere?

Definitely came up in a community meeting once.

ritazh commented 2 years ago

I think this was discussed in https://github.com/open-policy-agent/gatekeeper/discussions/1451#discussioncomment-1062583 (as noted in the community call notes); however, it seems to have been deleted.

maxsmythe commented 2 years ago

Argh, migrated to the discussions project maybe?

sozercan commented 2 years ago

We turned off discussions because of the opa/feedback migration; however, existing discussions couldn't be transferred. I took a screenshot of it now: [screenshot: discussion]

jdef commented 2 years ago

Thanks for digging up that thread. I've already added a couple leader-election flags to my GK fork and will be testing soon. At some point, I'd like to nix the fork though.

maxsmythe commented 2 years ago

Depending on the complexity/behavioral cost, I see no reason not to add it in principle. Curious to see what the PR would look like. I think @shomron was the most skeptical of this.

jdef commented 2 years ago

suggestions for a PR:

I'd love to submit a PR for this; sadly, there's a TON of corporate red tape in the way. The above changes yield a fairly small patch that's easy to test and backward compatible when leader election is false.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

jdef commented 2 years ago

/reopen

On Tue, Aug 30, 2022, 7:29 PM stale[bot] wrote:

> Closed #1794 https://github.com/open-policy-agent/gatekeeper/issues/1794 as completed.

maxsmythe commented 2 years ago

max-bot reopening. @sozercan, what should users do if they want to re-open a bug?

jdef commented 1 year ago

Not stale

On Tue, Aug 9, 2022, 3:23 PM stale[bot] wrote:

> This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

jdef commented 1 year ago

not stale

ritazh commented 1 year ago

@jdef Would you like to start a design doc and then bring the discussion to one of the community calls to drive this?

jdef commented 1 year ago

This seems like a pretty trivial change; is a design doc really needed? Otherwise, I'm happy to join a call.

On Mon, Dec 19, 2022 at 10:11 AM Rita Zhang wrote:

> @jdef Would you like to start a design doc and then bring the discussion to one of the community calls to drive this?

maxsmythe commented 1 year ago

Not sure if a full-scale design doc is needed, but it would be good to know which controllers are in scope for this (it sounds like singleton controllers only, such as status and audit); the other controllers should keep running if we want hot standbys.

Also it'd be good to avoid the "webhooks cannot run next to audit with leader election enabled" edge case. Is it possible to have leader election be selectively enforced?
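
For what it's worth, controller-runtime already supports this distinction through the manager.LeaderElectionRunnable interface: runnables that report true only start on the lease holder, while runnables that report false (the built-in webhook server does this) start on every replica. A sketch, where auditRunner is an illustrative stand-in rather than a real Gatekeeper type:

```go
package audit

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// auditRunner is a hypothetical leader-only task (illustrative name, not a
// Gatekeeper type).
type auditRunner struct{}

// Start runs until the manager's context is cancelled; the manager only
// calls it on the replica that currently holds the lease.
func (a *auditRunner) Start(ctx context.Context) error {
	// ... audit loop would go here ...
	<-ctx.Done()
	return nil
}

// NeedLeaderElection marks the runnable as leader-only. Runnables that
// return false (e.g. the webhook server) start on every replica.
func (a *auditRunner) NeedLeaderElection() bool { return true }

// Compile-time checks that the manager will treat this as a gated runnable.
var (
	_ manager.Runnable               = &auditRunner{}
	_ manager.LeaderElectionRunnable = &auditRunner{}
)
```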

Of course, this does mean that a pod will self-restart if it loses the leader position, which, if it is serving a webhook endpoint, might have some interesting impacts on endpoint behavior.

You mentioned this is for pods not running via Deployments, and in other bugs you mentioned pods running off-cluster. I'm curious what the story is for GC-ing stale by-pod status objects.

Lastly, is there a reason for the leader-election namespace to be different from G8r's namespace? (By default this is determined via the downward API and passed via an env variable.)
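
On the namespace point, a sketch of how the lease namespace could default to the pod's own namespace (POD_NAMESPACE is an assumed downward-API env var here, not an existing Gatekeeper setting):

```go
package audit

import (
	"os"

	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// leaderElectionOptions pins the lease to the pod's own namespace.
// POD_NAMESPACE is an assumed env var injected via the downward API.
func leaderElectionOptions() manager.Options {
	return manager.Options{
		LeaderElection:          true,
		LeaderElectionID:        "gatekeeper-audit-lock", // illustrative name
		LeaderElectionNamespace: os.Getenv("POD_NAMESPACE"),
	}
}
```

When LeaderElectionNamespace is left empty, controller-runtime falls back to reading the in-cluster namespace from the service account mount.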

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

chrissnell commented 2 months ago

Adding some discussion at the behest of @ritazh, carrying over from a conversation in Slack.

We are dealing with very busy audit pods on our clusters. These pods chew up 2 CPUs continuously. It would be great if the audit process could somehow be broken into workers that could be scaled horizontally and use less CPU per worker. This would allow more efficient bin packing and reduce cluster cost (Gatekeeper virtually guarantees an additional node per cluster for us).

Splitting audit into workers would also ease node-replacement operations during maintenance and pod/node reorganization of the kind Karpenter performs; a sketch of the idea follows.
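
One hypothetical shape for such a split (a sketch of the idea, not a proposed Gatekeeper design): partition audited objects across workers by a stable hash of their key, so each worker evaluates a disjoint slice of the cluster.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// assignedToWorker reports whether this worker owns the given object key,
// using a stable hash so each key maps to exactly one worker.
func assignedToWorker(key string, workerIndex, workerCount int) bool {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32()%uint32(workerCount)) == workerIndex
}

func main() {
	// Worker 0 of 3 would only audit the keys hashed to its shard.
	for _, key := range []string{"ns1/pod-a", "ns2/pod-b", "ns3/pod-c"} {
		fmt.Printf("%s owned by worker 0: %v\n", key, assignedToWorker(key, 0, 3))
	}
}
```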

Thanks

ritazh commented 2 months ago

Thanks for sharing your use case @chrissnell.

@maxsmythe @sozercan do we want to revisit this proposal?

If anyone is interested in this, please add 👍 and let us know if you want to help with the design and implementation.

chrissnell commented 2 months ago

Here's another issue with monolithic, singleton audit pods: they're labor-intensive to scale as a cluster grows. As the cluster gets bigger, you have to give this pod more resources, and unless you over-allocate from the beginning, you have to keep checking that it isn't resource-constrained. Breaking this into worker pods would make scaling effortless.

maxsmythe commented 4 weeks ago

What is the scaling concern with singleton pods for large clusters? What resources become constrained and how does the process degrade once it becomes resource constrained?

Also curious about the CPU usage concern... audit should just take longer with less CPU allocation. Are you seeing some failure behavior with lower CPU allocations?

If it's the RAM usage for caching referential constraints, multiple pods won't help -- each pod will require the same RAM use as a singleton pod.

Otherwise, what is the unit of batching you are proposing?