argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

argocd_cluster_connection_status still reporting on old endpoint after modifying cluster settings #20782

Open nbarrientos opened 1 week ago

nbarrientos commented 1 week ago

I have observed that if the server key of an existing cluster is modified, the application controller assigned to that cluster keeps reporting on the old endpoint, seemingly indefinitely, until the application-controller pod is re-created.

To Reproduce

Modify the server key of an existing cluster (in my case, from https://server1:6443 to https://server2:6443). After the change, the application controller metrics endpoint exposes two series, reporting on both endpoints (some labels removed):

argocd_cluster_connection_status{
  container="application-controller", 
  endpoint="http-metrics", 
  job="argocd-application-controller-metrics", 
  k8s_version="1.31", 
  namespace="argocd", 
  pod="argo-argocd-application-controller-1", 
  server="https://server1:6443", 
  service="argo-argocd-application-controller-metrics"} 0

argocd_cluster_connection_status{
  container="application-controller", 
  endpoint="http-metrics", 
  job="argocd-application-controller-metrics", 
  k8s_version="1.31", 
  namespace="argocd", 
  pod="argo-argocd-application-controller-1", 
  server="https://server2:6443", 
  service="argo-argocd-application-controller-metrics"} 1

The first sample has a value of 0 because, in my case, that endpoint was disabled immediately after Argo CD's configuration was changed.

Expected behavior

I think that no status should be reported for server1 as soon as the endpoint is no longer known to Argo CD (or at least as soon as it is no longer needed). Deleting the pod of the offending application controller (to force a re-creation) restores the desired behavior: the metric for server1 is immediately gone from the metrics endpoint and only information about server2 is reported (as expected, IMO).

The impact of this bug is that monitoring argocd_cluster_connection_status and alerting when any value is 0 can trigger false positives: once the cluster configuration is changed, the old endpoint could stop responding at any moment because the cluster behind it is being deleted (see server1 reporting 0 above). In other words, argocd_cluster_connection_status cannot be fully trusted because of this behavior.
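To make the false positive concrete, here is a minimal sketch of the kind of check an alert rule would perform, using the Prometheus Go client (the Prometheus address is a placeholder and the expression is only an example, not taken from my actual rules):

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    "github.com/prometheus/client_golang/api"
    v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
    // Placeholder address; point this at the Prometheus that scrapes Argo CD.
    client, err := api.NewClient(api.Config{Address: "http://prometheus.example:9090"})
    if err != nil {
        log.Fatal(err)
    }
    promAPI := v1.NewAPI(client)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    // A naive "cluster connection is down" check. After the server key is
    // changed, the stale https://server1:6443 series still matches and the
    // check fires even though that cluster is gone from Argo CD's config.
    result, warnings, err := promAPI.Query(ctx, "argocd_cluster_connection_status == 0", time.Now())
    if err != nil {
        log.Fatal(err)
    }
    if len(warnings) > 0 {
        log.Println(warnings)
    }
    fmt.Println(result)
}

The query output keeps including the server1 series until the application-controller pod is re-created.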

Would that metric have been purged without restarting the controller if I had set --metrics-cache-expiration (disabled by default)? If so, could the documentation be updated to explain that cluster endpoint information is also "cached" and its status keeps being reported? Otherwise, please consider this report a bug.

Version

argocd-server: v2.12.6+4dab5bd
  BuildDate: 2024-10-18T17:39:26Z
  GitCommit: 4dab5bd6a60adea12e084ad23519e35b710060a2
  GitTreeState: clean
  GoVersion: go1.22.4
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.4.2 2024-05-22T15:19:38Z
  Helm Version: v3.15.2+g1a500d5
  Kubectl Version: v0.29.6
  Jsonnet Version: v0.20.0
andrii-korotkov-verkada commented 5 days ago

Is there some kind of event we can watch for these changes? For example, when Pods change, there's a Kubernetes watch event that we process. I wonder if we can do something similar here.
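If such an event is available, deleting the stale series when it fires would make the metric consistent without a controller restart. A minimal sketch with prometheus/client_golang, assuming the connection status is kept in a GaugeVec keyed by (among others) the server label; the real Argo CD metrics code may be organized differently, and DeletePartialMatch requires client_golang v1.12 or newer:

package main

import (
    "fmt"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/testutil"
)

// Hypothetical stand-in for the controller's gauge; the real metric carries
// more labels than these two.
var clusterConnectionStatus = prometheus.NewGaugeVec(
    prometheus.GaugeOpts{
        Name: "argocd_cluster_connection_status",
        Help: "Cluster connection status (sketch).",
    },
    []string{"server", "k8s_version"},
)

// onClusterServerChanged would be called from whatever watch/event handler
// notices that a cluster's server URL changed or the cluster was removed.
// Deleting the stale series takes it off /metrics without restarting the
// controller; the next reconciliation re-registers the new endpoint.
func onClusterServerChanged(oldServer string) {
    clusterConnectionStatus.DeletePartialMatch(prometheus.Labels{"server": oldServer})
}

func main() {
    // Simulate the situation from this issue: server1 was replaced by server2.
    clusterConnectionStatus.WithLabelValues("https://server1:6443", "1.31").Set(0)
    clusterConnectionStatus.WithLabelValues("https://server2:6443", "1.31").Set(1)

    onClusterServerChanged("https://server1:6443")

    // Only the server2 series is left to be exported.
    fmt.Println(testutil.CollectAndCount(clusterConnectionStatus)) // 1
}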