openshift / service-serving-cert-signer

Archiving in favor of https://github.com/openshift/service-ca-operator
Apache License 2.0

High API server request rates #41

Open sjenning opened 5 years ago

sjenning commented 5 years ago

@ericavonb opening this to track the improvement effort

The operator is making 13-14 requests/s against the API server. 3-4 of those are PUTs, i.e. mutating requests, which can cause a lot of potentially unneeded etcd and watcher activity.

query sort_desc(sum by (client) (rate(apiserver_request_count[10m])))

{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format"}  13.570370370370371

query sort_desc(sum without (instance,code,type,contentType,job,endpoint,scope,service) (rate(apiserver_request_count{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",verb=~"GET|PUT|POST|UPDATE"}[10m])))

{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="configmaps",verb="GET"} 3.3807017543859645
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="configmaps",verb="PUT"} 1.9929824561403506
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="services",verb="PUT"}   0.8543859649122807
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="clusterroles",verb="GET"}   0.8333333333333334
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="services",verb="GET"}   0.8333333333333334
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="namespaces",verb="GET"} 0.8333333333333334
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="clusterrolebindings",verb="GET"}    0.8333333333333334
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="serviceaccounts",verb="GET"}    0.8333333333333334
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="clusterrolebindings",verb="PUT"}    0.8333333333333334
{client="service-serving-cert-signer/v0.0.0 (linux/amd64) kubernetes/$Format",namespace="default",resource="deployments",verb="GET"}    0.831578947368421

sjenning commented 5 years ago

I'm not sure what "fixed" is for this issue, but if an investigation could be done on where the activity is coming from and if it is justified, that would be good.

sjenning commented 5 years ago

/assign @mrogers950

smarterclayton commented 5 years ago

Almost all controllers should have a write rate of 0 against the API if no one is modifying their core objects. No one should be writing except on external edges. No one should have writes that only change a timestamp, etc.
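
To illustrate the rule, here is a minimal sketch of a sync step that skips the write when the stored object already matches the desired one. The `ConfigMap` type, `fakeClient`, and `syncConfigMap` are hypothetical stand-ins invented for this example, not the operator's actual code or the real client-go API.

```go
package main

import (
	"fmt"
	"reflect"
)

// ConfigMap is a simplified, hypothetical stand-in for a Kubernetes object.
type ConfigMap struct {
	Name string
	Data map[string]string
}

// fakeClient is a hypothetical in-memory client that counts writes,
// so the effect of the no-op guard is visible.
type fakeClient struct {
	stored ConfigMap
	puts   int
}

// Get returns the currently stored object (a read, no mutation).
func (c *fakeClient) Get() ConfigMap { return c.stored }

// Update stores the object and counts one write (the analogue of a PUT).
func (c *fakeClient) Update(cm ConfigMap) {
	c.stored = cm
	c.puts++
}

// syncConfigMap issues a write only when the desired state differs from
// the stored state. It reports whether a write was issued.
func syncConfigMap(c *fakeClient, desired ConfigMap) bool {
	existing := c.Get()
	if reflect.DeepEqual(existing, desired) {
		return false // nothing changed, so no PUT
	}
	c.Update(desired)
	return true
}

func main() {
	c := &fakeClient{}
	desired := ConfigMap{Name: "example", Data: map[string]string{"key": "value"}}

	// Simulate five passes of a sync loop with an unchanged desired state.
	for i := 0; i < 5; i++ {
		syncConfigMap(c, desired)
	}
	fmt.Println("PUTs issued:", c.puts) // prints "PUTs issued: 1"
}
```

Only the first pass writes; the remaining four are reads. A real controller would also need to ignore server-populated fields (resource version, timestamps) when comparing, which this sketch does not attempt.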

mrogers950 commented 5 years ago

@sjenning @smarterclayton We've got some issues with our operator sync loop that are causing this; @enj and I are trying to iron them out.