argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0
17.33k stars 5.26k forks source link

Concurrency Issue with Slugify Function #18369

Open mhotan opened 3 months ago

mhotan commented 3 months ago

Checklist:

Describe the bug

The Slugify function is currently not safe when accessed concurrently from multiple goroutines. Uncontrolled access to shared global state can cause data races where Slugify returns inconsistent results with the same input. This is a fairly serious problem for my organization as we utilize ApplicationSets (app of apps pattern) and intentionally don't self-heal. This leads to application sets utilizing slugify to render different application names when there is enough activity where different goroutines utilize SlugifyName. Subsequently, applications are destroyed, recreated, and left in an OutOfSync state.

In our situation, we enabled preserveResourcesOnDeletion for AppSets that use slugify. This prevents deleting underlying K8s resources, but the application and app set recreation are not ideal, as we have an alert condition when they are out of sync.

To Reproduce

It is difficult to reproduce organically. One can create an appset + generator that utilizes Slugify with different max lengths and monitors Kubernetes audit log deletion and recreation.

Running the test in this PR: https://github.com/argoproj/argo-cd/pull/18370 without including the fix would demonstrate the problem.

Expected behavior

SlugifyName with the same input parameters should be idempotent regardless of concurrent usage.

Screenshots

Slugify_Concurrency_issue

Version

❯ argocd version
argocd: v2.11.0+d3f33c0
  BuildDate: 2024-05-07T18:31:19Z
  GitCommit: d3f33c00197e7f1d16f2a73ce1aeced464b07175
  GitTreeState: clean
  GoVersion: go1.22.2
  Compiler: gc
  Platform: darwin/arm64
argocd-server: v2.11.0+d3f33c0
  BuildDate: 2024-05-07T16:01:41Z
  GitCommit: d3f33c00197e7f1d16f2a73ce1aeced464b07175
  GitTreeState: clean
  GoVersion: go1.21.9
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.2.1 2023-10-19T20:13:51Z
  Helm Version: v3.14.3+gf03cc04
  Kubectl Version: v0.26.11
  Jsonnet Version: v0.20.0

Logs

mhotan commented 3 months ago

There is a four year old open PR: https://github.com/gosimple/slug/pull/51 to make the underlying Go library that would be applicable.

agaudreault commented 2 months ago

@mhotan Can you add an example of applicationSet using the slugify function that is causing the issue. I think that would greatly help to reproduce.