fluxcd-community / helm-charts

Community maintained Helm charts for Flux
Apache License 2.0

source-controller: Could not load chart: connection refused #231

Open metacoma opened 1 month ago

metacoma commented 1 month ago

Describe the bug

I am using the Redpanda Operator to deploy a Redpanda cluster. The Redpanda Operator utilizes Flux2 for managing the cluster's lifecycle.

While debugging issue redpanda-data/redpanda-operator#261, I noticed that the source-controller does not listen on port 80 inside the source-controller pod.

What's your helm version?

argocd

What's your kubectl version?

Client Version: v1.29

What's the chart version?

2.12.4, 2.3.0

What happened?

$ kubectl -n redpanda get helmrelease -w
NAME        AGE     READY   STATUS
neo4j-cdc   3m42s   False   Could not load chart: failed to parse digest '': invalid checksum digest format
neo4j-cdc   4m5s    False   Could not load chart: GET http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz giving up after 10 attempt(s): Get "http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz": dial tcp 10.43.82.103:80: connect: connection refused
neo4j-cdc   4m5s    False   Could not load chart: failed to parse digest '': invalid checksum digest format
neo4j-cdc   6m23s   Unknown   Running 'install' action with timeout of 1m0s
neo4j-cdc   6m23s   Unknown   Running 'install' action with timeout of 1m0s
neo4j-cdc   7m5s    True      Helm install succeeded for release redpanda/neo4j-cdc.v1 with chart redpanda@5.9.5
neo4j-cdc   7m35s   True      Helm install succeeded for release redpanda/neo4j-cdc.v1 with chart redpanda@5.9.5
neo4j-cdc   34m   False   Could not load chart: GET http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz giving up after 10 attempt(s): Get "http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz": dial tcp 10.43.82.103:80: connect: connection refused

No program is listening on port 80 inside the source-controller pod:

$ kubectl -n flux-system get svc source-controller -o jsonpath='{.spec.ports}'
[
  {
    "name": "http",
    "port": 80,
    "protocol": "TCP",
    "targetPort": "http"
  }
]
$ kubectl -n flux-system exec -ti deployment/source-controller -- netstat -nlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 :::8080                 :::*                    LISTEN      1/source-controller
tcp        0      0 :::9090                 :::*                    LISTEN      1/source-controller
tcp        0      0 :::9440                 :::*                    LISTEN      1/source-controller

What you expected to happen?

I expected the source-controller service to be configured properly.

How to reproduce it?

Install the flux2 Helm chart with default values (no overrides).

Install redpanda-operator:

$ kubectl kustomize "https://github.com/redpanda-data/redpanda-operator//operator/config/crd?ref=v2.2.4-24.2.5" \
    | kubectl apply --server-side -f -
$ helm upgrade --install redpanda-controller redpanda/operator \
  --namespace redpanda-operator \
  --set image.tag=v2.2.4-24.2.5 \
  --create-namespace

Apply a Redpanda custom resource to create a small single-node cluster:

---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Redpanda
metadata:
  name: neo4j-cdc-stream
  namespace: redpanda
spec:
  chartRef:
    timeout: 1m0s
  clusterSpec:
    resources:
      cpu:
        cores: 100m
    external:
      domain: redpanda.local
      enabled: true
      type: NodePort
    tls:
      enabled: false
      certs:
        defaults:
          caEnabled: false
        external:
          caEnabled: false
    statefulset:
      replicas: 1
      initContainers:
        setDataDirOwnership:
          enabled: true
      livenessProbe:
        timeoutSeconds: 15
      readinessProbe:
        timeoutSeconds: 15
    storage:
      persistentVolume:
        enabled: true
        size: 1Gi

Watch the READY and STATUS changes of the HelmRelease resource:

$ kubectl -n redpanda get helmrelease -w
NAME        AGE     READY   STATUS
neo4j-cdc   3m42s   False   Could not load chart: failed to parse digest '': invalid checksum digest format
neo4j-cdc   4m5s    False   Could not load chart: GET http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz giving up after 10 attempt(s): Get "http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz": dial tcp 10.43.82.103:80: connect: connection refused
neo4j-cdc   4m5s    False   Could not load chart: failed to parse digest '': invalid checksum digest format
neo4j-cdc   6m23s   Unknown   Running 'install' action with timeout of 1m0s
neo4j-cdc   6m23s   Unknown   Running 'install' action with timeout of 1m0s
neo4j-cdc   7m5s    True      Helm install succeeded for release redpanda/neo4j-cdc.v1 with chart redpanda@5.9.5
neo4j-cdc   7m35s   True      Helm install succeeded for release redpanda/neo4j-cdc.v1 with chart redpanda@5.9.5
.... 
neo4j-cdc   34m   False   Could not load chart: GET http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz giving up after 10 attempt(s): Get "http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz": dial tcp 10.43.82.103:80: connect: connection refused
neo4j-cdc   34m   False   Could not load chart: GET http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz giving up after 10 attempt(s): Get "http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz": dial tcp 10.43.82.103:80: connect: connection refused
neo4j-cdc   39m   True    Helm install succeeded for release redpanda/neo4j-cdc.v1 with chart redpanda@5.9.5
neo4j-cdc   39m   True    Helm install succeeded for release redpanda/neo4j-cdc.v1 with chart redpanda@5.9.5
neo4j-cdc   39m   False   failed to verify artifact: computed checksum 'efe3fd90bce319c79f480e13ef5ce5543cbda4850863e07c7773b363a4116c6c' doesn't match advertised ''
neo4j-cdc   43m   False   Could not load chart: GET http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz giving up after 10 attempt(s): Get "http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz": dial tcp 10.43.82.103:80: connect: connection refused
neo4j-cdc   48m   True    Helm install succeeded for release redpanda/neo4j-cdc.v1 with chart redpanda@5.9.5
neo4j-cdc   48m   True    Helm install succeeded for release redpanda/neo4j-cdc.v1 with chart redpanda@5.9.5

Enter the changed values of values.yaml?

{}

Enter the command that you executed and that is failing or misbehaving.

kubectl -n flux-system exec -ti deployment/source-controller -- netstat -nlp | grep ':80 '

Anything else we need to know?

Environments:

Node Configurations: single-node k8s, 6 CPU, 16 GB RAM; single-node k8s, 14 CPU, 16 GB RAM

Kubernetes Versions: k3s: 1.29.X, 1.30.X

Flux Chart Version: 2.12.4, 2.3.0

Redpanda-Operator Chart Version: 0.4.20, 0.4.21, 0.4.27 (with image-tag: v2.2.2-24.2.4)

stefanprodan commented 1 month ago

This is probably an issue with your CNI; the source-controller Kubernetes Service exposes port 80: https://github.com/fluxcd-community/helm-charts/blob/main/charts/flux2/templates/source-controller-service.yaml

metacoma commented 1 month ago

> This is probably an issue with your CNI; the source-controller Kubernetes Service exposes port 80: https://github.com/fluxcd-community/helm-charts/blob/main/charts/flux2/templates/source-controller-service.yaml

Yes, it is:

$ kubectl -n flux-system get svc source-controller -o jsonpath='{.spec.ports}'
[
  {
    "name": "http",
    "port": 80,
    "protocol": "TCP",
    "targetPort": "http"
  }
]

However, no service is listening on HTTP (port 80) inside the source-controller pod.

$ kubectl -n flux-system exec -ti deployment/source-controller -- netstat -nlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 :::8080                 :::*                    LISTEN      1/source-controller
tcp        0      0 :::9090                 :::*                    LISTEN      1/source-controller
tcp        0      0 :::9440                 :::*                    LISTEN      1/source-controller

stefanprodan commented 1 month ago

> no service is listening on HTTP (port 80) inside the source-controller pod.

Why would it?

stefanprodan commented 1 month ago

Here are the Kubernetes docs on how port mapping works: https://kubernetes.io/docs/concepts/services-networking/service/#field-spec-ports
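
For illustration, the mapping those docs describe can be sketched in a few lines of Python. This is only a model of the lookup, not Kubernetes code: `resolve_target_port` is a hypothetical helper, and the container port names below are assumptions (the thread shows only the listening port numbers 8080, 9090 and 9440, not their names). The point is that a Service `port` of 80 with a named `targetPort` forwards to whichever container port carries that name, so nothing needs to listen on port 80 inside the pod.

```python
def resolve_target_port(service_port: dict, container_ports: list[dict]) -> int:
    """Return the container port number that a Service port forwards to."""
    # When targetPort is omitted, Kubernetes uses the same value as `port`.
    target = service_port.get("targetPort", service_port["port"])
    if isinstance(target, int):
        return target
    # A string targetPort is looked up by name among the pod's container ports.
    for container_port in container_ports:
        if container_port.get("name") == target:
            return container_port["containerPort"]
    raise LookupError(f"no container port named {target!r}")


# Service port exactly as printed by `kubectl get svc` in this thread.
service_port = {"name": "http", "port": 80, "protocol": "TCP", "targetPort": "http"}

# ASSUMPTION: the names are hypothetical; only the numbers come from the
# netstat output in this thread.
container_ports = [
    {"name": "http", "containerPort": 9090},
    {"name": "http-prom", "containerPort": 8080},
    {"name": "healthz", "containerPort": 9440},
]

resolved = resolve_target_port(service_port, container_ports)
print(f"Service port {service_port['port']} -> container port {resolved}")
```

Under these assumed names, a client connecting to the Service on port 80 would reach the process listening on 9090 in the pod, which is consistent with the netstat output showing no listener on port 80.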

metacoma commented 1 month ago

$ kubectl -n redpanda get helmrelease -w
NAME        AGE     READY   STATUS
neo4j-cdc   3m42s   False   Could not load chart: failed to parse digest '': invalid checksum digest format
neo4j-cdc   4m5s    False   Could not load chart: GET http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz giving up after 10 attempt(s): Get "http://source-controller.flux-system.svc.cluster.local./helmchart/redpanda/redpanda-neo4j-cdc/redpanda-5.9.5.tgz": dial tcp 10.43.82.103:80: connect: connection refused
neo4j-cdc   4m5s    False   Could not load chart: failed to parse digest '': invalid checksum digest format

If you think that this issue is due to the incorrect configuration of the CNI/network plugin in my infrastructure, feel free to close this ticket.