fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0

Helm Chart installation failed due to post renderer yaml unmarshal error #283

Closed · Jimmy-Newtron closed this issue 3 weeks ago

Jimmy-Newtron commented 3 years ago

Describe the bug

We have been deploying the ErpNext Helm chart using Flux.

Since upgrading Flux to version 0.15.3 we have been seeing the following error message:

Helm install failed: error while running post render on files: map[string]interface {}(nil): yaml: unmarshal errors:
  line 12: mapping key "app.kubernetes.io/name" already defined at line 8

PS: we are able to deploy the chart with a manual Helm install/upgrade command

Root cause

The Helm chart generates the following YAML resource:

# Source: erpnext/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: test-erpnext-redis
  labels:
    helm.sh/chart: erpnext-3.2.8
    app.kubernetes.io/name: erpnext
    app.kubernetes.io/instance: test
    app.kubernetes.io/version: "v13.5.1"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: erpnext-redis

source code: https://github.com/frappe/helm/blob/c27ba7d38687a9a67331ab555e369b16c79685b5/erpnext/templates/serviceaccount.yaml#L15

The duplicate app.kubernetes.io/name label is the most likely root cause.
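
A duplicate like this usually comes from a template that both includes a shared labels helper and then sets the same label explicitly again. A hypothetical fragment illustrating the pattern (helper name and values are illustrative, not the exact ErpNext template):

# hypothetical templates/serviceaccount.yaml fragment:
# the shared labels helper already emits app.kubernetes.io/name,
# and the template then sets the same key again for the redis service account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .Release.Name }}-redis
  labels:
    {{- include "erpnext.labels" . | nindent 4 }}
    app.kubernetes.io/name: {{ .Chart.Name }}-redis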

Reproduce the issue

Simply follow the ErpNext guide

https://helm.erpnext.com/
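
For a Flux-based reproduction, a minimal HelmRepository/HelmRelease pair along these lines can be used (namespaces and intervals are illustrative, not taken from the guide):

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: frappe
  namespace: flux-system
spec:
  interval: 1h
  url: https://helm.erpnext.com
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: erpnext
  namespace: erpnext
spec:
  interval: 10m
  chart:
    spec:
      chart: erpnext
      version: "3.2.8"
      sourceRef:
        kind: HelmRepository
        name: frappe
        namespace: flux-system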

Expected behavior

Since we are able to deploy the chart with a manual Helm install/upgrade command, we expect Flux to deploy the chart without the error above.

Additional context

Output of the following commands:

$ flux --version
flux version 0.15.3
$ flux check
► checking prerequisites
✔ kubectl 1.19.0 >=1.18.0-0
✔ Kubernetes 1.19.7 >=1.16.0-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.11.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.13.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.15.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.15.2
✔ all checks passed
stefanprodan commented 3 years ago

In my opinion this is a Helm bug, as helm install should error out on duplicate keys instead of silently dropping them and picking some arbitrary value for that key.

hiddeco commented 3 years ago

Since upgrading Flux to version 0.15.3 we have been seeing the following error message

Is there a Flux (or helm-controller) version prior to this where it did work without issues?

Jimmy-Newtron commented 3 years ago

helm-controller: v0.10.0, Flux: v0.13.2

stefanprodan commented 3 years ago

@hiddeco kyaml will error out on duplicate keys, as the YAML is invalid; it used to panic, see https://github.com/kubernetes-sigs/kustomize/issues/3480
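
That behaviour is easy to see in isolation; a quick sketch with a single resource containing a duplicate key (error text approximate):

cat > cm.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo
  labels:
    app: one
    app: two
EOF
cat > kustomization.yaml <<EOF
resources:
  - cm.yaml
EOF
kustomize build .
# fails with a yaml unmarshal error similar to the one reported above:
#   mapping key "app" already defined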

hiddeco commented 3 years ago

I think the chart should indeed not generate a duplicate label, as that is the root issue here.

The reason it works with helm install|upgrade is that we use a default Kustomize post-renderer to label resources as being managed via Helm through the controller, which is something Helm itself does not do.
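
Conceptually, that post-render step behaves roughly like feeding the rendered manifests through a kustomization that attaches the release metadata as labels; a sketch (label keys and values approximate, not the controller's exact implementation):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - helm-output.yaml        # the output of helm template, written to a file
labels:
  - includeSelectors: false
    pairs:
      helm.toolkit.fluxcd.io/name: test
      helm.toolkit.fluxcd.io/namespace: default

Because Kustomize has to parse the rendered YAML to attach these labels, any duplicate mapping key surfaces as the unmarshal error above, whereas a plain helm install never parses the manifests this strictly.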

Jimmy-Newtron commented 3 years ago

I have asked for the Helm chart to be modified, and here you can see the change:

https://github.com/frappe/helm/pull/92

The modification will fix the issue, and the difference between the Flux and Helm behaviors is a bit too complex to be tracked here.

I will close this issue once the chart installs correctly.

Thanks for the support and for explaining a few of the internals.

boeboe commented 3 years ago

+1

I had a similar issue with a duplicate entry generated by Helm, something that nevertheless works with plain Helm (helm or kubectl seem to take care of those duplicate entries).

I was able to fork my chart and fix it, but it would be more robust and foolproof if Flux would also take care of those duplicate entries prior to the reconciliation cycle.

janlauber commented 3 years ago

I faced a similar issue with the citrix-ingress-controller Helm chart and started debugging it. I found the duplicate line and opened a PR which fixes the issue for me:

https://github.com/citrix/citrix-helm-charts/pull/101

Feel free to have a look at the PR to understand what I did (and perhaps leave a thumbs-up to draw some attention to it). Greetings, Jan

monotek commented 2 years ago

Just ran into the same problem while upgrading from 0.18.0 to 0.23.0. It would be nice if the error message could point to the exact part where it's failing; I was not able to find my duplicate "app" label at first.

ilya-scale commented 2 years ago

I had the same error:

Helm install failed: error while running post render on files: map[string]interface {}(nil): yaml: unmarshal errors:
 line 19: mapping key "apiVersion" already defined at line 2
 line 34: mapping key "apiVersion" already defined at line 2

when I failed to write a proper separator (---) between manifests in the same file. While it was correct that it failed, the error was rather hard to understand; perhaps including the name of the file in the output would make it easier (since the line numbers seem to refer to a single file)?
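
For reference, the layout that triggers this error is two manifests in one file with no --- between them, for example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: first
data: {}
apiVersion: v1    # duplicate of line 1, because the "---" separator is missing here
kind: ConfigMap
metadata:
  name: second
data: {}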

mydoomfr commented 2 years ago

I spent too much time trying to debug this before figuring out that the problem was the chart itself. I was confused because the chart worked fine when deployed with helm install but not with the helm-controller (with the same values!)...

Helm install failed: error while running post render on files: map[string]interface {}(nil): yaml: unmarshal errors:
line 52: mapping key "apiVersion" already defined at line 2
line 51: mapping key "kind" already defined at line 3
line 53: mapping key "metadata" already defined at line 4
line 60: mapping key "spec" already defined at line 13

To find the root cause inside the chart, I used the helm template command to print the rendered YAML:

helm template test .
---
# Source: sonarqube/templates/networkpolicy.yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: test-sonarqube-database
  labels:
    app: sonarqube
    chart: sonarqube-5.4.0
    release: test
    heritage: Helm
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: postgresql
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: sonarqube
      ports:
        - port: 5432
  egress:
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - port: 53
          protocol: UDP---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: test-sonarqube-additional-network-policy
  labels:
    app: sonarqube
    chart: sonarqube-5.4.0
    release: test
    heritage: Helm
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
    ports:
    - port: 9000
      protocol: TCP
  podSelector:
    matchLabels:
      app: sonarqube
  policyTypes:
  - Ingress

As you can see, the template is broken: the document separator (---) is appended directly after the last line of the first manifest, so the second manifest is parsed as a continuation of the first and its top-level keys become duplicates. https://github.com/SonarSource/helm-chart-sonarqube/pull/265

Subetov commented 1 year ago

Faced a similar problem: the controller cannot install the chart due to duplicate entries.

helm-controller  Helm install failed: error while running post render on files: map[string]interface {}(nil): yaml: unmarshal errors:  line 154: mapping key "volumes" already defined at line 147

When installing through the Helm CLI there is no such problem, and Flux v1 (helm-operator) does not have this problem either.

kingdonb commented 1 year ago

The Helm CLI and Flux v1 both used client-side apply, which is a bit more forgiving; Flux v2 does not permit duplicate entries. If you read the output of helm template on your chart, you should see the duplicate entries. Helm does not validate this at all: it simply emits a giant blob of string-templated YAML and passes it on to the applier once templating is done. The applier (effectively kubectl) condenses the duplicate entries into a single one, arbitrarily dropping one value in favor of the one that wins.

Flux v2 validates through server-side apply, which parses the YAML and notices things like a duplicate entry, so you'll have to fix it upstream in your chart. The most common error is a label that gets placed in some manifest and then repeated in a template that manifest includes. It should be easy to detect and isolate this error by passing the output of helm template through a YAML linter, so it can be fixed in the next published chart release upstream.
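
For example, assuming yamllint is installed, the rendered output can be checked for duplicate keys only:

helm template test . | yamllint -d '{rules: {key-duplicates: enable}}' -
# prints the line and column of every duplicated mapping key in the rendered output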

kvaps commented 6 months ago

Hi, I just wanted to leave a note on how to debug such issues:

cat > kustomize.sh <<\EOT
#!/bin/bash
# save the manifests Helm pipes into the post-renderer, then let kustomize
# parse (and thereby validate) them; the temp file is removed on success
cat - > helm-generated-output.yaml
kustomize build . && rm helm-generated-output.yaml
EOT
chmod +x kustomize.sh

cat > kustomization.yaml <<\EOT
resources:
- helm-generated-output.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
patches: []
EOT

example usage:

$ helm template kubernetes-asd . --dry-run=server --post-renderer ./kustomize.sh
Error: error while running post render on files: error while running command /Users/kvaps/git/cozystack/packages/apps/kubernetes/kustomize.sh. error output:
Error: map[string]interface {}(nil): yaml: unmarshal errors:
  line 54: mapping key "tolerations" already defined at line 18
: exit status 1

Use the --debug flag to render out the invalid YAML as well.

Cheers!

FluxCDLearner commented 3 months ago

I am getting this issue, but in my case I have re-checked tens of times for whatever Flux is telling me is duplicated, and I cannot see any duplicated keys (YAML-linting the helm template output rendered with the values.yaml I am using doesn't show any duplicates at all). See the related discussion: https://github.com/fluxcd/flux2/discussions/4890

Does anyone apart from @kvaps have any idea how to debug this issue? I have tried their approach without success (it gives me no errors...).