argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0
18.02k stars 5.5k forks

ArgoCD app stuck in Progressing for ingress workloads with no LB IP address #14607

Open caio-eiq opened 1 year ago

caio-eiq commented 1 year ago

Describe the bug

ArgoCD application is stuck in Progressing when there are Ingresses with empty status.loadBalancer.

To Reproduce

  1. Deploy the NGINX Ingress Controller (NGINX Inc.):
helm repo add nginx-stable https://helm.nginx.com/stable
helm install nginx nginx-stable/nginx-ingress -n nginx-ingress --create-namespace
  2. Create an Ingress manifest:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.org/mergeable-ingress-type: master
  name: ingress-master
  namespace: nginx-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: abc.com

This gives an Ingress without a load balancer IP address:

$ kubectl get ingress -n nginx-ingress ingress-master
NAME             CLASS   HOSTS     ADDRESS   PORTS   AGE
ingress-master   nginx   abc.com             80      160m
  3. Deploy an ArgoCD application to manage the nginx-controller Helm chart.
  4. Sync the app. Everything is synced, but the app status stays Progressing forever.

If we inspect the live manifest, status.loadBalancer is empty, which appears to be why this is happening.

Expected behavior

ArgoCD should be able to be healthy when ingresses are deployed without a load balancer IP address.

Version

image

Update: After looking at the code at https://github.com/argoproj/gitops-engine/blob/master/pkg/health/health_ingress.go#L7, the Ingress health check only considers whether status.loadBalancer.ingress has more than 0 entries; otherwise the Ingress is marked as "Progressing". In most cases where people hit this issue, the status shows only:

status:
  loadBalancer: {}

and hence we see this issue.
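The behaviour described above can be sketched in a few lines of Go. This is a simplified, dependency-free stand-in, not the gitops-engine source: the function name `ingressHealth` and the use of plain `map[string]any` values in place of the Kubernetes unstructured types are assumptions made for illustration.

```go
package main

import "fmt"

// ingressHealth is a hypothetical, simplified version of the check:
// an Ingress is Healthy only if status.loadBalancer.ingress has at
// least one entry; in every other case it is reported as Progressing.
func ingressHealth(obj map[string]any) string {
	// Failed type assertions yield nil maps, and reading from a nil map is safe.
	status, _ := obj["status"].(map[string]any)
	lb, _ := status["loadBalancer"].(map[string]any)
	entries, _ := lb["ingress"].([]any)
	if len(entries) > 0 {
		return "Healthy"
	}
	return "Progressing"
}

func main() {
	// status.loadBalancer: {} as in the live manifest from this issue.
	empty := map[string]any{
		"status": map[string]any{"loadBalancer": map[string]any{}},
	}
	// A status populated by an ingress controller (example values).
	populated := map[string]any{
		"status": map[string]any{"loadBalancer": map[string]any{
			"ingress": []any{map[string]any{"ip": "203.0.113.10"}},
		}},
	}
	fmt.Println(ingressHealth(empty))
	fmt.Println(ingressHealth(populated))
}
```

With an empty `status.loadBalancer` there is no branch that ever leads to Healthy, which matches the app staying in Progressing indefinitely.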

A possible solution would be support for an annotation, for example argoproj.io/expect-ingresses: "false", which would tell the health check not to require status.loadBalancer to be populated.
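To make the proposal concrete, here is a minimal sketch of how such an opt-out could short-circuit the check. Everything in it is hypothetical: the annotation is only the example suggested in this issue and is not an existing Argo CD feature, and `proposedIngressHealth` and `nestedMap` are illustrative names operating on plain maps rather than real Kubernetes types.

```go
package main

import "fmt"

// expectIngressesAnnotation is the example annotation proposed in this issue.
// It is hypothetical and not something Argo CD is known to support.
const expectIngressesAnnotation = "argoproj.io/expect-ingresses"

// nestedMap returns m[key] as a map, or nil if it is missing or not a map.
func nestedMap(m map[string]any, key string) map[string]any {
	child, _ := m[key].(map[string]any)
	return child
}

// proposedIngressHealth sketches the suggested behaviour: an Ingress that
// opts out via the annotation is Healthy without any load balancer entry;
// all other Ingresses still need a non-empty status.loadBalancer.ingress.
func proposedIngressHealth(obj map[string]any) string {
	annotations := nestedMap(nestedMap(obj, "metadata"), "annotations")
	if v, _ := annotations[expectIngressesAnnotation].(string); v == "false" {
		return "Healthy"
	}
	entries, _ := nestedMap(nestedMap(obj, "status"), "loadBalancer")["ingress"].([]any)
	if len(entries) > 0 {
		return "Healthy"
	}
	return "Progressing"
}

func main() {
	optedOut := map[string]any{
		"metadata": map[string]any{"annotations": map[string]any{
			expectIngressesAnnotation: "false",
		}},
		"status": map[string]any{"loadBalancer": map[string]any{}},
	}
	plain := map[string]any{
		"status": map[string]any{"loadBalancer": map[string]any{}},
	}
	fmt.Println(proposedIngressHealth(optedOut))
	fmt.Println(proposedIngressHealth(plain))
}
```

Making the opt-out explicit per Ingress keeps the default unchanged for Ingresses that are expected to receive an address.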

ebuildy commented 1 year ago

We ran into the same issue with Traefik, because no IP is reported in the Ingress object status. We fixed it by following:

https://community.traefik.io/t/traefik-ingress-not-getting-loadbalancer-ip/17445

With nginx ingress, this could help:

https://docs.nginx.com/nginx-ingress-controller/configuration/global-configuration/reporting-resources-status/

rataja commented 1 year ago

Just to leave a hint here: in my case it was caused by two different Ingresses pointing to the same URL. The old one (in a different namespace) was working, but the new one couldn't be assigned. I deleted the old one and the app in ArgoCD immediately became healthy.

uluzox commented 1 year ago

Same issue here with Azure Application Gateway Ingress Controller for Azure Kubernetes Service.

ebuildy commented 1 year ago

Azure Application Gateway Ingress Controller

Can you open a ticket on the github project please?

It should already be handled, according to the source code here: https://github.com/Azure/application-gateway-kubernetes-ingress/blob/master/pkg/k8scontext/context.go#L759

func (c *Context) updateV1IngressStatus(ingressToUpdate networking.Ingress, newIP IPAddress) error {
    ingressClient := c.kubeClient.NetworkingV1().Ingresses(ingressToUpdate.Namespace)
    ingress, err := ingressClient.Get(context.TODO(), ingressToUpdate.Name, metav1.GetOptions{})
    if err != nil {
        e := controllererrors.NewErrorWithInnerErrorf(
            controllererrors.ErrorUpdatingIngressStatus,
            err,
            "Unable to get ingress %s/%s", ingressToUpdate.Namespace, ingressToUpdate.Name,
        )
        c.MetricStore.IncErrorCount(e.Code)
        return e
    }

    for _, lbi := range ingress.Status.LoadBalancer.Ingress {
        existingIP := lbi.IP
        if existingIP == string(newIP) {
            klog.Infof("IP %s already set on Ingress %s/%s", lbi.IP, ingress.Namespace, ingress.Name)
            return nil
        }
    }

    loadBalancerIngresses := []v1.LoadBalancerIngress{}
    if newIP != "" {
        loadBalancerIngresses = append(loadBalancerIngresses, v1.LoadBalancerIngress{
            IP: string(newIP),
        })
    }
    ingress.Status.LoadBalancer.Ingress = loadBalancerIngresses

    if _, err := ingressClient.UpdateStatus(context.TODO(), ingress, metav1.UpdateOptions{}); err != nil {
        e := controllererrors.NewErrorWithInnerErrorf(
            controllererrors.ErrorUpdatingIngressStatus,
            err,
            "Unable to update ingress %s/%s status", ingress.Namespace, ingress.Name,
        )
        c.MetricStore.IncErrorCount(e.Code)
        return e
    }

    return nil
}

cerhades commented 1 year ago

I was able to fix this issue on my setup. I used the Traefik Helm chart for installation and had to enable publishedService. See the following snippet from my Traefik Helm values file. Once I enabled that, the ArgoCD apps showed as healthy.

providers:
  kubernetesCRD:
    allowCrossNamespace: true
    allowExternalNameServices: true
  kubernetesIngress:
    allowExternalNameServices: true
    publishedService:
      enabled: true

Hazmi35 commented 10 months ago

Hi, I have a question. Do you use LoadBalancer as the Traefik Service type? If so, do you know of any workaround for this issue when using ClusterIP?

I have a setup where I use ClusterIP for the Traefik Service, because I don't expose Traefik directly but through a cloudflared tunnel in another pod.

autokilla47 commented 9 months ago

Same problem. kubernetes: v1.28.6 argo-cd: v2.9.5+f943664

woojh3690 commented 9 months ago

Same problem. kubernetes: v1.28.6 argo-cd: v2.10.1+a79e0ea

maetthu commented 9 months ago

I was facing the same issue with a ClusterIP Traefik behind an externally provisioned AWS load balancer bound to the Traefik Service using TargetGroupBinding. The workaround that worked for me is to set the following argument to the load balancer FQDN:

--providers.kubernetesingress.ingressendpoint.hostname=ingress.example.org

Traefik then populates the status with

status:
  loadBalancer:
    ingress:
      - hostname: >-
          ingress.example.org

and ArgoCD is happy.

MarkhamLee commented 4 months ago

I was able to fix this issue on my setup. I used the Traefik Helm chart for installation and had to enable publishedService. See the following snippet from my Traefik Helm values file. Once I enabled that, the ArgoCD apps showed as healthy.

providers:
  kubernetesCRD:
    allowCrossNamespace: true
    allowExternalNameServices: true
  kubernetesIngress:
    allowExternalNameServices: true
    publishedService:
      enabled: true

Thank you so much, this has been bugging me for a couple of days now and your solution worked perfectly. Thanks again!

shamil commented 3 months ago

We are facing the same issue. We have an Ingress that doesn't get an IP address; it is used for routing purposes only. Because of that, ArgoCD never reports the app as healthy (it is always Progressing).

@alexmt Is a fix planned for this? Any ideas or workarounds?

lq-natemertins commented 3 months ago

Also struggling with this while trying to use ArgoCD to deploy Anypoint Runtime Fabric, namely when provisioning the ingress template resource. This is a pretty unique case, but I like the idea of an annotation that could act as a short circuit for the Ingress health check.

didlawowo commented 2 months ago

Got the same problem with ngrok.

nodox commented 1 month ago

# values.yaml
providers:
  kubernetesCRD:
    allowCrossNamespace: true
    allowExternalNameServices: true

  kubernetesIngress:
    allowExternalNameServices: true
    publishedService:
      enabled: true

This is the format of the Helm chart values that worked for me.

MauroSoli commented 3 weeks ago

In my env, the problem has been resolved without the use of allowCrossNamespace:

traefik/traefik    33.0.0    v3.2.0    A Traefik based Kubernetes ingress controller

providers:
  kubernetesCRD:
    # -- Allows to reference ExternalName services in IngressRoute
    allowExternalNameServices: true
  kubernetesIngress:
    # -- Allows to reference ExternalName services in Ingress
    allowExternalNameServices: true
    publishedService:
      enabled: true

andrii-korotkov-verkada commented 2 days ago

Sounds like this can be resolved by other means in a number of cases. Should the healthcheck be updated to show unhealthy in these situations?