argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Deploying from a helm repo to ArgoCD in a proxy environment results in timeout as Argo adds the stable repo by default #2826

Open afanrasool opened 4 years ago

afanrasool commented 4 years ago

Describe the bug

Currently in a proxy environment, when I try to deploy an app from a private helm repo, ArgoCD times out and fails. Following are my findings as to why it does that:

The Argo CD repo-server runs helm init --client-only --skip-refresh. This adds the stable repo (URL: https://kubernetes-charts.storage.googleapis.com) alongside the private repo I added to Argo, which is hosted in Artifactory. Access to the private repo doesn't need a proxy, but since the repo-server pod doesn't have proxy variables set, it can't reach the stable repo when it runs helm repo update, so it times out and returns an error.

Logs attached below from repo-server pod.

To Reproduce

In a proxy environment, deploy an app from ArgoCD UI or using a declarative approach (application resource) from a private helm repo that is accessible from the cluster without proxy.

Expected behavior

The Argo CD repo-server shouldn't add the stable repo (https://kubernetes-charts.storage.googleapis.com) by default, because in a proxy environment it won't have access to it and would therefore time out during helm repo update.

Version

$ argocd version
argocd: v1.3.0+9f8608c
  BuildDate: 2019-11-13T01:49:01Z
  GitCommit: 9f8608c9fcb2a1d8dcc06eeadd57e5c0334c5800
  GitTreeState: clean
  GoVersion: go1.12.6
  Compiler: gc
  Platform: linux/amd64

Logs

From repo-server when I try to create an app from a helm repo

time="2019-12-05T19:59:04Z" level=info msg="manifest cache miss: 0.1.43/&ApplicationSource{RepoURL:https://artifactory.xxxxxxxxxxxxxxx,Path:,TargetRevision:0.1.43,Helm:nil,Kustomize:nil,Ksonnet:nil,Directory:nil,Plugin:nil,Chart:catalog,}"
time="2019-12-05T19:59:12Z" level=error msg="`helm repo update` failed timeout after 1m30s" execID=KOsLx
time="2019-12-05T19:59:12Z" level=error msg="finished unary call with code Unknown" error="`helm repo update` failed timeout after 1m30s" grpc.code=Unknown grpc.method=GetAppDetails grpc.request.deadline="2019-12-05T19:58:42Z" grpc.service=repository.RepoServerService grpc.start_time="2019-12-05T19:57:42Z" grpc.time_ms=90061.29 span.kind=server system=grpc
time="2019-12-05T19:59:12Z" level=info msg="helm init --client-only --skip-refresh" dir="/tmp/https:__artifactory.xxxxxxxxxxx" execID=GQGCe
time="2019-12-05T19:59:12Z" level=info msg="helm repo update" dir="/tmp/https:__artifactory.xxxxxxxxxxx" execID=W41qf
time="2019-12-05T19:59:13Z" level=info msg="manifest cache hit: &ApplicationSource{RepoURL:http://git.xxxxxxxxxxxxxx/catalog,TargetRevision:HEAD,Helm:&ApplicationSourceHelm{ValueFiles:[values.yaml],Parameters:[{apiserver.storage.etcd.persistence.enabled false false} {image quay.io/kubernetes-service-catalog/service-catalog:v0.2.1 false} {apiserver.replicas 1 false}],ReleaseName:service-catalog,Values:,},Kustomize:nil,Ksonnet:nil,Directory:nil,Plugin:nil,Chart:,}/2c9adc2df1596483711655aa1e5186e6b737992d"
time="2019-12-05T19:59:13Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GenerateManifest grpc.request.deadline="2019-12-05T20:00:12Z" grpc.service=repository.RepoServerService grpc.start_time="2019-12-05T19:59:12Z" grpc.time_ms=791.33 span.kind=server system=grpc

Going into the repo server pod and running the same commands confirms this:

$ kubectl exec -it argocd-repo-server-6f4b75f4bb-q2f9x -n argocd  bash
$ helm init --client-only --skip-refresh
Creating /home/argocd/.helm
Creating /home/argocd/.helm/repository
Creating /home/argocd/.helm/repository/cache
Creating /home/argocd/.helm/repository/local
Creating /home/argocd/.helm/plugins
Creating /home/argocd/.helm/starters
Creating /home/argocd/.helm/cache/archive
Creating /home/argocd/.helm/repository/repositories.yaml
Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
Adding local repo with URL: http://127.0.0.1:8879/charts
$HELM_HOME has been configured at /home/argocd/.helm.
Not installing Tiller due to 'client-only' flag having been set
argocd@argocd-repo-server-6f4b75f4bb-q2f9x:~$ helm repo add <redacted>
"xx-helm-dev" has been added to your repositories
argocd@argocd-repo-server-6f4b75f4bb-q2f9x:~$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "xx-helm-dev" chart repository
...Unable to get an update from the "stable" chart repository (https://kubernetes-charts.storage.googleapis.com):
        Get https://kubernetes-charts.storage.googleapis.com/index.yaml: dial tcp 216.58.196.80:443: connect: connection timed out
Update Complete.
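
The manual run above shows that only the stable repository fails to refresh. As a minimal sketch (Python, for illustration only; the regular expression is an assumption based on the Helm v2 output shown above, not a documented format), the failing repositories can be pulled out of such output like this:

```python
import re
from typing import List, Tuple

# Matches lines such as:
# ...Unable to get an update from the "stable" chart repository (https://...):
# The pattern is an assumption derived from the log output quoted in this issue.
FAILED_REPO = re.compile(
    r'Unable to get an update from the "(?P<name>[^"]+)" chart repository \((?P<url>[^)]+)\)'
)


def failed_repos(helm_output: str) -> List[Tuple[str, str]]:
    """Return (name, url) for every repository that helm repo update could not refresh."""
    return [(m.group("name"), m.group("url")) for m in FAILED_REPO.finditer(helm_output)]


# Output copied from the manual reproduction in this comment.
SAMPLE = """\
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "xx-helm-dev" chart repository
...Unable to get an update from the "stable" chart repository (https://kubernetes-charts.storage.googleapis.com):
        Get https://kubernetes-charts.storage.googleapis.com/index.yaml: dial tcp 216.58.196.80:443: connect: connection timed out
Update Complete.
"""

if __name__ == "__main__":
    print(failed_repos(SAMPLE))
```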

alexec commented 4 years ago

Would you like to submit a PR to fix this?

afanrasool commented 4 years ago

Hi @alexec. I haven't dived deep into the source code yet. I will need to take a look at it and will open a PR if I find a fix.

niiku commented 4 years ago

@alexec @afanrasool I just ran into the same issue. I see the following possible solutions, based on this issue: https://github.com/helm/helm/issues/3749

Set --stable-repo-url

Allow defining the --stable-repo-url argument while executing helm init in util/helm/cmd.go#L42, based on an entry in the argocd-cm ConfigMap. The ConfigMap entry could look like this:

data:
  helm.stable.repo.url: https://charts.example.tld # default: https://kubernetes-charts.storage.googleapis.com

If this entry is set, the helm init command would be executed with the argument --stable-repo-url $helm.stable.repo.url

Disabling stable repo

Add the possibility to remove the stable repo if the corresponding config is set in the argocd-cm ConfigMap. Such a ConfigMap entry could look like this:

data:
  helm.stable.repo.enabled: 'false' # default: true

To implement this, an additional function called removeStableRepo is added in util/helm/cmd.go. The execution of this function would be more or less:

helm repo remove stable

This function would be called from util/helm/client.go#L116 when the config value helm.stable.repo.enabled is false.

If I have overlooked a possible solution, let me know. It would probably be best if both variants were implemented to provide the necessary flexibility for different circumstances (e.g. a required internal mirror of the stable repo, or complete exclusion of the stable repo). The config value helm.stable.repo.url would be ignored in the case of helm.stable.repo.enabled: 'false'.
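
To make the interplay of the two proposed settings concrete, here is a minimal sketch in Python (for illustration only; Argo CD itself is written in Go). Both ConfigMap keys are the hypothetical entries proposed in this comment, not existing Argo CD settings, and plan_helm_setup is a made-up helper name. It only computes which helm commands would run, in order:

```python
from typing import Dict, List

# Hypothetical argocd-cm keys from the proposal above; not existing settings.
URL_KEY = "helm.stable.repo.url"
ENABLED_KEY = "helm.stable.repo.enabled"


def plan_helm_setup(config: Dict[str, str]) -> List[List[str]]:
    """Return the helm commands to run, in order, for the given config."""
    # Anything other than the string 'false' keeps today's behaviour.
    enabled = config.get(ENABLED_KEY, "true").strip().lower() != "false"
    url = config.get(URL_KEY, "").strip()

    init = ["helm", "init", "--client-only", "--skip-refresh"]
    if enabled and url:
        # Point the default "stable" entry at an internal mirror.
        init += ["--stable-repo-url", url]

    commands = [init]
    if not enabled:
        # The URL setting is ignored; the stable repo is removed right after init.
        commands.append(["helm", "repo", "remove", "stable"])
    return commands


if __name__ == "__main__":
    for cmd in plan_helm_setup({ENABLED_KEY: "false", URL_KEY: "https://charts.example.tld"}):
        print(" ".join(cmd))
```

With helm.stable.repo.enabled set to 'false', the sketch plans a plain helm init followed by helm repo remove stable, regardless of any configured URL.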

What are your thoughts? If nothing speaks against my suggestions, I would prepare an appropriate PR.

afanrasool commented 4 years ago

@niiku That's exactly where my head was at for a potential fix, but I haven't had a chance to work on the PR. If you can do that, that would be awesome!

alexec commented 4 years ago

I was wondering: the Helm stable repository is going away (see https://github.com/helm/charts#deprecation-timeline), and therefore we should remove support for it. At the same time we should be upgrading to Helm 3 (see #2864). Should this be part of that work?

niiku commented 4 years ago

@alexec I think Argo CD shouldn't drop the Helm stable repository by default until Nov 13, 2020, since it normally doesn't hurt. In particular, removing it would break the default usage of the new feature introduced here: https://github.com/argoproj/argo-cd/issues/1145

As for Helm v3, I think there should be a long time where Argo CD works with Helm v2 and v3 in parallel to support backwards compatibility. Helm v3 has no default repository anyway, as there's no helm init - so this issue shouldn't be relevant for Helm v3.

niiku commented 4 years ago

Please excuse me, of course it is up to you how long you want to support Helm v2 and the stable repository. With the circumstances, it makes more sense to implement only the helm repo remove stable function for Helm v2. I will try to suggest a corresponding commit.

niiku commented 4 years ago

@alexec Unfortunately, I don't know exactly how I can best implement a feature flag for users currently using the helm stable repo without an entry in helm.repositories. I noticed that the reposerver does not access ConfigMaps directly and that this is not foreseen. I see two possibilities how such a flag could be implemented:

Extending ManifestRequest

To transfer the initially proposed ConfigMap entry helm.stable.repo.enabled from the argocd-server to the reposerver, the v1alpha1.ManifestRequest type could be extended by a field HelmOptions analogous to KustomizeOptions, which holds the corresponding configuration value.

Adding a cmd parameter

The cmd argocd-repo-server could be extended by a parameter --enable-helm-stable-repo or similar. If this is set to false, the helm stable repo is removed. The advantage of this solution would be that no api types have to be modified.
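
As a rough illustration of the second option, the following Python sketch shows how such a boolean parameter could be parsed. It is not Argo CD code (the real binary is written in Go), and --enable-helm-stable-repo is the hypothetical flag proposed above, defaulting to the current behaviour:

```python
import argparse
from typing import List, Optional


def parse_bool(value: str) -> bool:
    """Parse an explicit true/false value given on the command line."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="argocd-repo-server")
    # Hypothetical flag from the proposal above; True keeps the stable repo.
    parser.add_argument("--enable-helm-stable-repo", type=parse_bool, default=True)
    return parser


def stable_repo_enabled(argv: Optional[List[str]] = None) -> bool:
    """Return whether the stable repo should be kept for the given arguments."""
    return build_parser().parse_args(argv).enable_helm_stable_repo


if __name__ == "__main__":
    print(stable_repo_enabled(["--enable-helm-stable-repo=false"]))
```

When the function returns False, the repo server would run helm repo remove stable after helm init; no API types need to change for this variant.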

Did I miss an easier way? Or would it be acceptable to remove the helm stable repository every time helm init is run? Could you give me some input on how to proceed in fixing this issue? This would help me a lot to solve this issue in an appropriate way.

niiku commented 4 years ago

Hi @alexec, hope you had great holidays and a happy new year. Would it be possible that you could point me in the right direction with my previous questions? That would be really great :-)

alexec commented 4 years ago

I'd suggest that we add no repos by default. This is because the Helm stable repo is no longer the recommended way to distribute your app.

This should probably be done with the Helm v3 support (#2383).

zeph commented 4 years ago

@afanrasool I work around this in my setup by setting the Unix proxy environment variables that are normally picked up by any library, so this is not a problem for me at all. Maybe we should have a documentation patch. Do you have time for it? The following goes into the Deployment resources of argocd-repo-server, argocd-application-controller, and argocd-server:

env:
- name: HTTP_PROXY
  value: http://XXX-dsi-proxy:3128
- name: HTTPS_PROXY
  value: http://XXX-dsi-proxy:3128
- name: NO_PROXY
  value: 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.cluster.local,argocd-repo-server,.test.dsp.XXX.de,.shared.dsp.XXX.de,.dst.kube
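
To sanity-check a NO_PROXY value like the one above before rolling it out, a small Python sketch can approximate which hosts would bypass the proxy. This is a simplified approximation and an assumption on my part: the exact matching rules differ between HTTP clients, and bypasses_proxy is a made-up helper, not part of Argo CD or Helm.

```python
import ipaddress


def bypasses_proxy(host: str, no_proxy: str) -> bool:
    """Approximate check of whether host is exempted by a NO_PROXY value."""
    host = host.strip().lower()
    try:
        host_ip = ipaddress.ip_address(host)
    except ValueError:
        host_ip = None  # not an IP literal, so treat it as a hostname

    for entry in (e.strip().lower() for e in no_proxy.split(",")):
        if not entry:
            continue
        if "/" in entry:
            # CIDR entries only apply to IP literals; hostnames are not resolved.
            try:
                network = ipaddress.ip_network(entry, strict=False)
            except ValueError:
                continue
            if host_ip is not None and host_ip in network:
                return True
        elif entry.startswith("."):
            # ".cluster.local" matches subdomains of cluster.local.
            if host.endswith(entry):
                return True
        elif host == entry or host.endswith("." + entry):
            return True
    return False


if __name__ == "__main__":
    no_proxy = "127.0.0.0/8,10.0.0.0/8,.cluster.local,argocd-repo-server"
    for h in ("argocd-repo-server", "10.96.0.10", "kubernetes-charts.storage.googleapis.com"):
        print(h, bypasses_proxy(h, no_proxy))
```

Note that in this sketch an internal hostname is only exempted if it is listed by name or suffix; the CIDR entries apply only when the URL contains an IP literal.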

zeph commented 4 years ago

@niiku I would simply close this ticket based on my comment above, have u tested it for ur setup?

niiku commented 4 years ago

@zeph Yes, I tested the proxy trick, but as we're doing a "man-in-the-middle" I would need to configure the certificate from our webproxy in the repo server. And there should be no reason for the repo server to have access to the internet (so people can't use other chart repositories than ours).

zeph commented 4 years ago

p.s. this issue applies also if using DEX to integrate Azure's SSO for example

niiku commented 4 years ago

@alexec So v1.5 is going to have some breaking changes I guess? Or are you going to support Helm v2 and v3 at the same time? If that's the case, a corresponding fix for Helm v2 would still make sense.

jkleinlercher commented 4 years ago

We are also behind an HTTP proxy, and as I understand the discussion, there is no workaround to get local Helm repos working: there is no way to disable the default stable Helm repo, and the http_proxy env variables aren't working either. Am I right? From my perspective, a feature to disable the default repo with a cmd parameter, as described above, would be perfect, because adding a no_proxy env for all local communication can be quite error-prone.

niiku commented 4 years ago

@jkleinlercher Thanks for the feedback. I'm going to submit a corresponding PR in the next couple of days.

zeph commented 4 years ago

@niiku hold the fire... I already provided a patch for https://github.com/argoproj/argo-cd/issues/3055

zeph commented 4 years ago

@jkleinlercher ah, u want a solution not based on env variables? uhmm

jkleinlercher commented 4 years ago

@zeph PR #3063 is of course very useful if you want to retrieve Helm charts from the internet via a forward proxy. However, imho it is also very good to be able to disable the default Helm repo if you don't need it. Maintaining a no_proxy list is very error-prone in my experience, so I do not want to have one if I don't need it.

afanrasool commented 4 years ago

@zeph Apologies for the late reply..catching up on this now. I have set up env vars as a workaround but I would agree with @jkleinlercher above ^^.

zeph commented 4 years ago

@afanrasool at this point I fear I don't get the full complexity of your problem, never mind. What I care about is PR #3063, because that currently blocks me ...

jkleinlercher commented 4 years ago

I found this document: https://argoproj.github.io/argo-cd/faq/#argo-cd-cannot-deploy-helm-chart-based-applications-without-internet-access-how-can-i-solve-it. It seems like an easy workaround, @afanrasool and @niiku.

MattLud commented 4 years ago

@jkleinlercher - I tried setting that up but it didn't seem to work for me.

I was writing up an example repo for my team on this and tried using the Off-the-shelf example, trying to get it to use our mirror of the wordpress chart.

argocd@argocd-repo-server-557bc7c877-chsrm:/tmp/https:__<git_REPO>/staging/OTS-example$ helm --debug dep up       
Hang tight while we grab the latest from your chart repositories...
...Unable to get an update from the "local" chart repository (http://127.0.0.1:8879/charts):
    Get http://127.0.0.1:8879/charts/index.yaml: dial tcp 127.0.0.1:8879: connect: connection refused
...Successfully got an update from the "stable" chart repository
Update Complete.
Saving 1 charts
Downloading wordpress from repo https://<internal-mirror>/kubernetes-charts-storage-googleapis-com
Save error occurred:  could not download https://kubernetes-charts.storage.googleapis.com/wordpress-1.0.9.tgz: Get https://kubernetes-charts.storage.googleapis.com/wordpress-1.0.9.tgz: dial tcp 172.217.163.144:443: connect: connection timed out
Deleting newly downloaded charts, restoring pre-update state
Error: could not download https://kubernetes-charts.storage.googleapis.com/wordpress-1.0.9.tgz: Get https://kubernetes-charts.storage.googleapis.com/wordpress-1.0.9.tgz: dial tcp 172.217.163.144:443: connect: connection timed out

alex1989hu commented 4 years ago

Deployed the latest argo-cd with PR #3063. 🎉 I was able to add the bitnami Helm repository (we have a MITM proxy), but I was not able to add kubernetes-charts.storage.googleapis.com. This is where it gets strange.

I noticed that argo-cd deployed metrics-server somehow. Pod is running but argo-cd shows:

ComparisonError: helm repo add --ca-file /app/config/tls/kubernetes-charts.storage.googleapis.com stable https://kubernetes-charts.storage.googleapis.com

Furthermore, I checked that I can neither deploy any application nor browse charts, due to x509: certificate signed by unknown authority. All of the above works with the bitnami Helm repository. I think argo-cd handles the stable repository specially.

abdennour commented 3 years ago

So, an offline environment and a disconnected cluster, right? Then:

apiVersion: v1
data:
  Corefile: |
    .:53 {
        log
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 30
        }
        file /etc/coredns/helm-repo-stable.db storage.googleapis.com
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
  helm-repo-stable.db: |
    storage.googleapis.com. 5 IN    SOA ns1.dns.storage.googleapis.com. hostmaster.storage.googleapis.com. (
            12345      ; serial
            14400      ; refresh (4 hours)
            3600       ; retry (1 hour)
            604800     ; expire (1 week)
            5          ; minimum (5 seconds)
            )
    storage.googleapis.com.     5 IN    NS ns1.dns.storage.googleapis.com.
    ns1.dns.storage.googleapis.com.  5 IN  A 10.96.0.10 ; Refer to IP of kube-dns.kube-system.svc.cluster.local.
    kubernetes-charts.storage.googleapis.com.    5   IN      A       192.168.19.12 ; Refer to one of nodes IP

In this example I resolved kubernetes-charts.storage.googleapis.com. to an existing IP, which is one of my node IPs (192.168.19.12).

Also, I've added the line file /etc/coredns/helm-repo-stable.db storage.googleapis.com to the Corefile above. Note also 10.96.0.10, which is the cluster IP of the 'kube-dns' service (kube-dns.kube-system.svc.cluster.local.).

Take care

jdoylei commented 2 years ago

@zeph - Thank you for the workaround you posted above!

I've been trying to use Argo CD in our environment with Helm like this, following some pointers at Continuous Delivery with Helm and Argo CD:

Our environment requires an HTTP proxy to reach the external Helm chart repository but not the internal Git repository. (This is just due to a mixture of SaaS and on-prem that our company happens to use.)

I was not able to get this to work without your workaround (adding proxy-related env vars to argocd-repo-server).

Note, I could use Argo CD's Repositories feature to define a Helm-type Repository and give it a Proxy URL. But the Proxy URL only worked if I then used the Repository to create a Helm-type Application, which isn't really what I wanted to do. When Argo CD processes Helm charts recursively via dependencies, it does a helm repo add command, and that helm repo add correctly uses authentication configured in the Repository (username and password) but not the Proxy URL configured in the same Repository.

It seems like Argo CD is close to handling this: it would only need to pick up the existing Proxy setting in the Repository when processing Helm chart dependencies.
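
As a sketch of what that could look like (Python, for illustration only; this is not how Argo CD is implemented, and the parameter names such as proxy and base_env are hypothetical), the helm repo add invocation could carry the Repository's proxy alongside its credentials by setting the standard proxy environment variables for that one subprocess:

```python
from typing import Dict, List, Optional, Tuple


def helm_repo_add_invocation(
    name: str,
    url: str,
    username: Optional[str] = None,
    password: Optional[str] = None,
    proxy: Optional[str] = None,
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build the argv and environment for one helm repo add call."""
    argv = ["helm", "repo", "add", name, url]
    if username:
        argv += ["--username", username]
    if password:
        argv += ["--password", password]

    # Copy the environment so the proxy applies to this call only.
    env = dict(base_env or {})
    if proxy:
        env["HTTP_PROXY"] = proxy
        env["HTTPS_PROXY"] = proxy
    return argv, env


if __name__ == "__main__":
    argv, env = helm_repo_add_invocation(
        "external", "https://charts.example.tld",
        username="user", password="secret",
        proxy="http://proxy.example.tld:3128",
    )
    print(argv)
    print(env)
```

The result could then be passed to something like subprocess.run(argv, env=env), so that only repositories with a configured proxy go through it and the internal Git repository stays unaffected.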