fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0

Malformed templating of helm values that contain quotes #656

Open onedr0p opened 1 year ago

onedr0p commented 1 year ago

Hi all 👋🏼

I am using the latest released version of Flux (v0.41.2)

I am trying to install the thanos Helm chart from https://github.com/stevehipwell/helm-charts

Source: https://github.com/stevehipwell/helm-charts/tree/main/charts/thanos

When Flux installs this chart, parts of the rendered output are malformed: some double quotes are missing.

Here is my HelmRelease:

---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: thanos
  namespace: monitoring
spec:
  interval: 15m
  chart:
    spec:
      chart: thanos
      version: 1.12.0
      sourceRef:
        kind: HelmRepository
        name: stevehipwell
        namespace: flux-system
  maxHistory: 3
  install:
    createNamespace: true
    remediation:
      retries: 3
  upgrade:
    cleanupOnFail: true
    remediation:
      retries: 3
  uninstall:
    keepHistory: false
  values:
    additionalReplicaLabels: ["__replica__"]
    objstoreConfig:
      key: objstore.yml
      value:
        type: s3
        config:
          insecure: true
    compact:
      enabled: true
      extraArgs:
        - --compact.concurrency=4
        - --delete-delay=30m
        - --retention.resolution-1h=14d
        - --retention.resolution-5m=14d
        - --retention.resolution-raw=14d
      persistence:
        enabled: true
        storageClass: local-path
        size: 20Gi
    query:
      enabled: true
      replicas: 3
    queryFrontend:
      enabled: true
      replicas: 3
      ingress:
        enabled: true
        ingressClassName: nginx
        annotations:
          nginx.ingress.kubernetes.io/whitelist-source-range: |
            10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
          hajimari.io/enable: "false"
        hosts:
          - &host thanos-query-frontend.devbu.io
        tls:
          - hosts:
              - *host
    receive:
      enabled: true
      replicationFactor: 3
      retention: 12h
      router:
        replicas: 3
        extraArgs:
          - --receive.hashrings-algorithm=ketama
      ingestor:
        replicas: 3
        extraArgs:
          - --tsdb.wal-compression
        persistence:
          enabled: true
          storageClass: local-path
          size: 20Gi
    storeGateway:
      replicas: 3
      persistence:
        enabled: true
        storageClass: local-path
        size: 20Gi
    rule:
      enabled: true
      replicas: 3
      extraArgs:
        - --web.prefix-header=X-Forwarded-Prefix
      alertmanagersConfig:
        value: |-
          alertmanagers:
            - api_version: v2
              scheme: http
              timeout: 30s
              static_configs:
                - kube-prometheus-stack-alertmanager.monitoring.svc.cluster.local:9093
      rules:
        value: |-
          groups:
            - name: PrometheusWatcher
              rules:
                - alert: PrometheusDown
                  annotations:
                    summary: A Prometheus has disappeared from Prometheus target discovery
                  expr: absent(up{job="kube-prometheus-stack-prometheus"})
                  for: 10m
                  labels:
                    severity: critical
      persistence:
        enabled: true
        storageClass: local-path
        size: 20Gi
    serviceMonitor:
      enabled: true
  valuesFrom:
    - targetPath: objstoreConfig.value.config.bucket
      kind: ConfigMap
      name: thanos-bucket-v1
      valuesKey: BUCKET_NAME
    - targetPath: objstoreConfig.value.config.endpoint
      kind: ConfigMap
      name: thanos-bucket-v1
      valuesKey: BUCKET_HOST
    - targetPath: objstoreConfig.value.config.region
      kind: ConfigMap
      name: thanos-bucket-v1
      valuesKey: BUCKET_REGION
    - targetPath: objstoreConfig.value.config.access_key
      kind: Secret
      name: thanos-bucket-v1
      valuesKey: AWS_ACCESS_KEY_ID
    - targetPath: objstoreConfig.value.config.secret_key
      kind: Secret
      name: thanos-bucket-v1
      valuesKey: AWS_SECRET_ACCESS_KEY

Here is the thanos-receive-router-hashrings ConfigMap as rendered by helm-controller:

apiVersion: v1
data:
  hashrings.json: |
    [{
      "hashring": default",
      tenants": [],
      endpoints": [
        "thanos-receive-ingestor-0.thanos-receive-ingestor-headless.monitoring.svc.cluster.local:10901"
        "thanos-receive-ingestor-1.thanos-receive-ingestor-headless.monitoring.svc.cluster.local:10901"
        "thanos-receive-ingestor-2.thanos-receive-ingestor-headless.monitoring.svc.cluster.local:10901"
      ]
    }]
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: thanos
    meta.helm.sh/release-namespace: monitoring
  creationTimestamp: "2023-03-28T16:18:36Z"
  labels:
    app.kubernetes.io/component: receive-router
    app.kubernetes.io/instance: thanos
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: thanos
    app.kubernetes.io/version: 0.31.0
    helm.sh/chart: thanos-1.12.0
    helm.toolkit.fluxcd.io/name: thanos
    helm.toolkit.fluxcd.io/namespace: monitoring
  name: thanos-receive-router-hashrings
  namespace: monitoring
  resourceVersion: "74731185"
  uid: 8b01c3e2-d7db-4e4c-99b1-e5249b198a23

The templated data in the ConfigMap is malformed JSON, with several opening quotes missing:

    [{
      "hashring": default",
      tenants": [],
      endpoints": [
        "thanos-receive-ingestor-0.thanos-receive-ingestor-headless.monitoring.svc.cluster.local:10901"
        "thanos-receive-ingestor-1.thanos-receive-ingestor-headless.monitoring.svc.cluster.local:10901"
        "thanos-receive-ingestor-2.thanos-receive-ingestor-headless.monitoring.svc.cluster.local:10901"
      ]
    }]
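
A quick way to flag lines like these is to count the double quotes on each line: an odd count suggests a quote was dropped. The following is only a rough sketch. The function name `unbalanced_quote_lines` is hypothetical, and the check is a heuristic that does not treat escaped quotes (`\"`) specially.

```shell
#!/bin/sh
# Hypothetical helper: print each input line (prefixed with its line number)
# that contains an odd number of double quotes.
# Heuristic only: escaped quotes (\") are counted like any other quote.
unbalanced_quote_lines() {
  # gsub() returns the number of matches; replacing with "&" leaves the line unchanged.
  awk '{ n = gsub(/"/, "&"); if (n % 2 == 1) print NR ": " $0 }'
}
```

For example, piping `kubectl get configmap thanos-receive-router-hashrings -n monitoring -o yaml` into `unbalanced_quote_lines` should list the `"hashring"`, `tenants"` and `endpoints"` lines shown above.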

When I put the same Helm values in a file called values.yaml and render the chart locally, the output is correct:

Command:

helm template stevehipwell/thanos --version 1.12.0 --values values.yaml

Truncated output, in which the quotes are properly balanced:

data:
  hashrings.json: |
    [{
      "hashring": "default",
      "tenants": [],
      "endpoints": [
        "release-name-thanos-receive-ingestor-0.release-name-thanos-receive-ingestor-headless.default.svc.cluster.local:10901"
        "release-name-thanos-receive-ingestor-1.release-name-thanos-receive-ingestor-headless.default.svc.cluster.local:10901"
        "release-name-thanos-receive-ingestor-2.release-name-thanos-receive-ingestor-headless.default.svc.cluster.local:10901"
      ]
    }]

I have also noticed this elsewhere, for example in the container args:

      containers:
      - args:
        - receive
        - --log.level=info
        - --log.format=logfmt
        - --grpc-address=0.0.0.0:10901
        - --http-address=0.0.0.0:10902
        - --remote-write.address=0.0.0.0:19291
        - --objstore.config-file=/etc/thanos/objstore.yaml
        - --tsdb.path=/var/thanos/receive
        - --tsdb.retention=12h
        - --label=receive_replica="$(NAME) # This quote is enclosed when using `helm template`
        - --receive.local-endpoint=$(NAME).thanos-receive-ingestor.monitoring-headless.svc.cluster.local:10901
        - --tsdb.wal-compression

Does anyone know what could be going on here?

hiddeco commented 1 year ago

I have little time to investigate this issue this week, as we are wrapping up some other big changes to Flux.

One thing that comes to mind is that this may be a side effect of Kustomize applying the labels that tie the release to its HelmRelease. If so, it should be reproducible with plain Helm by using a post-renderer that pipes the rendered output through kustomize build.
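
A minimal sketch of such a post-renderer follows, written as a shell snippet that creates the script. The file name `kustomize-post-renderer.sh` is hypothetical, and using `commonLabels` to add the two `helm.toolkit.fluxcd.io` labels is an assumption that only approximates what helm-controller does internally. It also assumes `kustomize` is on the `PATH`.

```shell
# Write a hypothetical Helm post-renderer to the current directory.
cat > kustomize-post-renderer.sh <<'EOF'
#!/bin/sh
# Reads the rendered manifests on stdin, runs them through `kustomize build`,
# and writes the result to stdout, as Helm expects from a post-renderer.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Helm pipes the fully rendered manifests to the post-renderer on stdin.
cat > "$workdir/all.yaml"

# Assumption: commonLabels roughly mimics the tracking labels added by helm-controller.
cat > "$workdir/kustomization.yaml" <<KUSTOMIZATION
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - all.yaml
commonLabels:
  helm.toolkit.fluxcd.io/name: thanos
  helm.toolkit.fluxcd.io/namespace: monitoring
KUSTOMIZATION

kustomize build "$workdir"
EOF
chmod +x kustomize-post-renderer.sh
```

It could then be tried with `helm template thanos stevehipwell/thanos --version 1.12.0 --namespace monitoring --values values.yaml --post-renderer ./kustomize-post-renderer.sh`. If the quotes go missing in that output as well, the problem is likely in the Kustomize step and not in the chart.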

onedr0p commented 1 year ago

Seems like this could be related to https://github.com/fluxcd/helm-controller/issues/383 then :/