argoproj / gitops-engine

Democratizing GitOps
https://pkg.go.dev/github.com/argoproj/gitops-engine?tab=subdirectories
Apache License 2.0

feat: Drop unnecessary listing for the sake of watch reinitialization #616

Open tosi3k opened 1 month ago

tosi3k commented 1 month ago

This change addresses the cluster cache performance issue described in https://github.com/argoproj/argo-cd/issues/18838.

kube-apiserver logs for the Pods resource (the implausibly low latency logged for the WATCH requests is due to a Kubernetes bug: https://github.com/kubernetes/kubernetes/issues/125614):

INFO 2024-07-23T05:26:40.330372Z "HTTP" verb="LIST" URI="/api/v1/pods?limit=500&resourceVersion=0" latency="95.186516ms" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="976ba7bb-40cc-4c2b-9739-e15c5e78415f" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_execution_time="94.722114ms" resp=200
INFO 2024-07-23T05:26:40.516139Z "HTTP" verb="WATCH" URI="/api/v1/pods?allowWatchBookmarks=true&resourceVersion=10636&timeoutSeconds=600&watch=true" latency="1.212642ms" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="26593f98-914a-4711-8eea-db9f284a8520" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_init_latency="529.67µs" apf_execution_time="533.174µs" resp=0
INFO 2024-07-23T05:36:40.518449Z "HTTP" verb="WATCH" URI="/api/v1/pods?allowWatchBookmarks=true&resourceVersion=17070&timeoutSeconds=600&watch=true" latency="1.104058ms" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="6e495250-eb2b-4af4-bb09-d999197d7e73" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_init_latency="531.187µs" apf_execution_time="532.709µs" resp=0
INFO 2024-07-23T05:46:40.522146Z "HTTP" verb="WATCH" URI="/api/v1/pods?allowWatchBookmarks=true&resourceVersion=23542&timeoutSeconds=600&watch=true" latency="988.866µs" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="05329b6e-8bea-44c0-b22a-42b7ce65b796" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_init_latency="499.414µs" apf_execution_time="500.85µs" resp=0
INFO 2024-07-23T05:56:40.524950Z "HTTP" verb="WATCH" URI="/api/v1/pods?allowWatchBookmarks=true&resourceVersion=29971&timeoutSeconds=600&watch=true" latency="995.693µs" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="fe66a06c-1329-4739-90f0-f0cdd4dd821d" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_init_latency="455.428µs" apf_execution_time="456.954µs" resp=0
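
The excerpt above shows a single LIST followed by WATCH requests roughly ten minutes apart (`timeoutSeconds=600`), each starting from a higher `resourceVersion`. A minimal sketch of that control flow, assuming a stand-in server: `fakeAPI`, `syncLoop`, and `errGone` are hypothetical names for illustration and are not taken from gitops-engine or client-go. The loop relists only when the server reports that the requested resourceVersion is too old (HTTP 410 Gone in Kubernetes), not after an ordinary watch timeout.

```go
package main

import (
	"errors"
	"fmt"
)

// errGone stands in for an HTTP 410 "resource version too old" response.
var errGone = errors.New("resource version too old")

// fakeAPI is a hypothetical stand-in for kube-apiserver that counts requests.
type fakeAPI struct {
	lists, watches int
	rv             int  // current resourceVersion on the server
	goneOnce       bool // when true, the next watch fails with errGone
}

// list returns the resourceVersion a subsequent watch should start from.
func (a *fakeAPI) list() int {
	a.lists++
	return a.rv
}

// watch simulates one watch session that runs until its timeout and
// returns the resourceVersion of the last event it delivered.
func (a *fakeAPI) watch(fromRV int) (int, error) {
	a.watches++
	if a.goneOnce {
		a.goneOnce = false
		return fromRV, errGone
	}
	a.rv = fromRV + 100 // pretend 100 changes happened during the session
	return a.rv, nil
}

// syncLoop lists once, then runs the given number of watch sessions.
// An expired watch is resumed from the last seen resourceVersion;
// a relist happens only when the server answers with errGone.
func syncLoop(api *fakeAPI, sessions int) {
	rv := api.list()
	for i := 0; i < sessions; i++ {
		next, err := api.watch(rv)
		if errors.Is(err, errGone) {
			rv = api.list() // fall back to a fresh list
			continue
		}
		rv = next // plain expiry: resume without listing
	}
}

func main() {
	api := &fakeAPI{rv: 10636}
	syncLoop(api, 4)
	fmt.Printf("lists=%d watches=%d\n", api.lists, api.watches) // lists=1 watches=4
}
```

With four watch sessions and no 410 response, the simulated server sees one LIST and four WATCH requests, the same shape as the log excerpt.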

tosi3k commented 1 month ago

@crenshaw-dev thanks for the review - I'll respond to the comments here later on.

FWIW, as agreed offline during yesterday's sync, I split the fix into two PRs: this one only drops the unnecessary listing after watch expiry, and https://github.com/argoproj/gitops-engine/pull/617 makes the list API calls target the watch cache instead of etcd.

I guess we can proceed with the latter one.
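
For context on the second PR: in the Kubernetes API, a LIST with `resourceVersion` unset requires the most recent data, while `resourceVersion=0` accepts data at any version and may therefore be answered from the API server's watch cache. A minimal sketch of how the query string differs, using only the Go standard library; `listPodsQuery` and its `fromCache` flag are hypothetical helpers for illustration and do not reflect how gitops-engine builds its requests.

```go
package main

import (
	"fmt"
	"net/url"
)

// listPodsQuery builds the request URI for a paginated LIST of Pods.
// fromCache is a hypothetical switch: when set, resourceVersion=0 is added,
// which tells the API server that data at any version is acceptable.
func listPodsQuery(limit int, fromCache bool) string {
	q := url.Values{}
	q.Set("limit", fmt.Sprint(limit))
	if fromCache {
		q.Set("resourceVersion", "0")
	}
	// Encode sorts parameters by key, so the output is deterministic.
	return "/api/v1/pods?" + q.Encode()
}

func main() {
	fmt.Println(listPodsQuery(500, true))  // /api/v1/pods?limit=500&resourceVersion=0
	fmt.Println(listPodsQuery(500, false)) // /api/v1/pods?limit=500
}
```

The first URI matches the LIST request in the log excerpt in the PR description.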

sonarcloud[bot] commented 1 month ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

wojtek-t commented 1 month ago

This LGTM from a k8s perspective.