fluxcd / source-controller

The GitOps Toolkit source management component
Apache License 2.0

Receiving chart pull error on environment with a proxy - EOF #1485

Closed Valgueiro closed 3 weeks ago

Valgueiro commented 1 month ago


I have my k8s cluster deployed behind a firewall that only allows connections from a proxy on the same network.

Flux version: v2.1.2
Source controller version: 1.1.2

I've set up the gotk kustomization as follows so that the controllers use the proxy to fetch things:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
patches:
  - patch: |
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: all
      spec:
        template:
          spec:
            containers:
              - name: manager
                env:
                  - name: "HTTPS_PROXY"
                    value: "http://proxy.com:3128"
                  - name: "NO_PROXY"
                    value: ".cluster.local.,.cluster.local,cluster.local,.svc,,"
                  - name: "https_proxy"
                    value: "http://proxy.com:3128"
                  - name: "no_proxy"
                    value: ".cluster.local.,.cluster.local,cluster.local,.svc,,"
    target:
      kind: Deployment
      labelSelector: app.kubernetes.io/part-of=flux

And I have the HelmRelease and HelmRepository configured like this:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: keda
  namespace: keda
spec:
  interval: 5m0s
  releaseName: keda
  install:
    createNamespace: true
  chart:
    spec:
      chart: keda
      version: '2.12.1'
      sourceRef:
        kind: HelmRepository
        name: charts
        namespace: keda
  valuesFrom:
  - kind: ConfigMap
    name: keda-values
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: charts
  namespace: keda
spec:
  type: "oci"
  interval: 5m0s
  url: oci://<acr>/sre/charts/
  secretRef:
    name: registry-pull-secret
  certSecretRef:
    name: tls-ca

My HelmRepo is showing as active, but the HelmChart is showing as "Reconciling" and I can see the following error:

chart pull error: failed to download chart for remote reference: failed to get 'oci://<acr>/sre/charts/keda:2.12.1': failed to do request: Head "https://<acr>/v2/sre/charts/keda/manifests/2.12.1": EOF

I thought this could be related to this issue about http_proxy on busybox images: https://github.com/mirror/busybox/issues/21, so I tried using this Docker image as the source-controller:

FROM <acr>/sre/fluxcd/source-controller:v1.1.2
USER root

COPY zscaler.crt /etc/ssl/certs/
RUN update-ca-certificates

RUN apk --no-cache -U add openssl wget ca-certificates
# wget https://httpbin.org/get

USER 65534:65534

But I continued to receive the same error.

Do you guys have any idea of what I can do to fix this?

Valgueiro commented 1 month ago

Other things that can be useful here:

  1. The same setup works when I remove the firewall and proxy from the architecture.
  2. This is the output when I try to do a HEAD request from inside the source-controller container:

     ~ $ wget --spider https://<acr>/v2/sre/rancher-alerting-drivers/manifests/102.1.0
     Spider mode enabled. Check if remote file exists.
     --2024-05-13 22:07:41--  https://<acr>/v2/sre/rancher-alerting-drivers/manifests/102.1.0
     Resolving proxy.com (proxy.com)... <proxy-ip>
     Connecting to proxy.com (proxy.com)|<proxy-ip>|:3128... connected.
     Proxy request sent, awaiting response... 401 Unauthorized

  3. I tried to debug the code myself but couldn't get much further. From what I could understand, the error originates here: https://github.com/fluxcd/source-controller/blob/f8eea53bda618099f7f633ae289c8200b0cb3555/internal/helm/chart/builder_remote.go#L161, more specifically when calling Client.get: https://github.com/fluxcd/source-controller/blob/f8eea53bda618099f7f633ae289c8200b0cb3555/internal/helm/repository/chart_repository.go#L281

Valgueiro commented 1 month ago

I just confirmed with tcpdump that source-controller is sending requests directly to the OCI URL without going through the proxy. This should not be happening, since the proxy is set up on the Flux services as the docs suggest.

stefanprodan commented 1 month ago

Can you please try with an OCIRepository and see if that works? There is an example here: https://fluxcd.io/blog/2024/05/flux-v2.3.0/#enhanced-helm-oci-support

souleb commented 1 month ago

I believe this is fixed in https://github.com/helm/helm/commit/94c1deae6d5a43491c5a4e8444ecd8273a8122a1. Upgrading Helm to v3.15.0 in source-controller should resolve this.

stefanprodan commented 1 month ago

Switching to OCIRepo and HelmRelease v2 should work as we don’t use the Helm getter in OCIRepo.

Valgueiro commented 1 month ago

I tried just updating to the latest Flux version, which uses a version of Helm that already contains the fix (source-controller 1.3.0 points to Helm 3.14.4), while still keeping the HelmRepository, and it did not work. I will give the OCIRepo a try.

souleb commented 1 month ago

As I wrote above, it is fixed in Helm v3.15.0. We have not updated Flux to that version yet. I would try Stefan's suggestion on Flux v2.3.0.

Valgueiro commented 1 month ago

As I wrote above, the fix is already in Flux version 2.3.0. Even the author of the fix bumped another repository to 3.14.4 to fix his own issue. As you can see from the link to the code at 3.14.4, it is already there, which means that 2.3.0 already has this fix.

So bumping Helm to 3.15 in the future will not solve the issue that I am facing.

souleb commented 1 month ago

Thanks @Valgueiro, indeed we instantiate our own http.Transport. This will be fixed in the next Flux minor.