open-policy-agent / gatekeeper

🐊 Gatekeeper - Policy Controller for Kubernetes
https://open-policy-agent.github.io/gatekeeper/
Apache License 2.0
3.55k stars 729 forks source link

validation latencies capped at 3 secs even though validatingWebhookTimeoutSeconds set at 5 #3415

Open pankajmt opened 3 weeks ago

pankajmt commented 3 weeks ago

What steps did you take and what happened: [A clear and concise description of what the bug is.]

We are running load test on a policy making an external_data call to a service which is in another region and hence has a 100+ msec latency. validatingWebhookTimeoutSeconds is set to 5. Still, on our load tests, I see validation latencies capped at 3 secs and denied admission: eval_cancel_error: caller cancelled query execution errors. Any ideas why?

I know if we upgrade to 3.13, we can enable provider caching.

What did you expect to happen:

Cap at 5 sec and then start throwing eval_cancel_error.

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

pankajmt commented 3 weeks ago

I am also looking to try max-serving-threads set to something like 30 when cpu requests is set to 3.

We do not set limits in reference to this section in the docs - Gatekeeper uses automaxprocs to default this value to the CPU limit set by the linux cgroup (i.e. the limits passed to the Kubernetes container).