VictoriaMetrics / helm-charts

Helm charts for VictoriaMetrics, VictoriaLogs and ecosystem
https://victoriametrics.github.io/helm-charts/
Apache License 2.0

bug: Upgrade from 0.24.5 to 0.25.17 fails, with incompatible Kubernetes version #1529

Open passie opened 1 week ago

passie commented 1 week ago

Chart name and version: chart: victoria-metrics-k8s-stack, version: 0.25.17

Describe the bug: After upgrading from 0.24.5 to 0.25.17, we see the following error in ArgoCD:

Failed to load target state: failed to generate manifest for source 1 of 1: rpc error: code = Unknown desc = Manifest generation error (cached): plugin sidecar failed. error generating manifests in cmp: rpc error: code = Unknown desc = error generating manifests:argo-cd-helmfile.sh generatefailed exit status 1: helm version v3.15.3+g3bb50bb helmfile version 0.167.1 starting generate in ./helmfile.yaml: [exit status 1 COMMAND: kustomize build /tmp/chartify603341198/victoria-metrics/victoria-metrics/victoria-metrics-k8s-stack --output /tmp/chartify603341198/victoria-metrics/victoria-metrics/victoria-metrics-k8s-stack/all.patched.yaml --enable-alpha-plugins OUTPUT: Error: no matches for Id Service.v1.[noGrp]/vm-coredns.kube-system; failed to find unique target for patch Service.v1.[noGrp]/vm-coredns.kube-system]

When rendering the Helm chart with helmfile template -e cluster01, we receive the following error message:

STDERR:
  Error: chart requires kubeVersion: >=1.25.0-0 which is incompatible with Kubernetes v1.24.0
  Use --debug flag to render out invalid YAML

However, our clusters are running version 1.26.5:

Client Version: v1.25.6+vmware.wcp.2
Kustomize Version: v4.5.7
Server Version: v1.26.5+vmware.2-fips.1

We are running ArgoCD version 2.12.3
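
The error reports Kubernetes v1.24.0 rather than the server's 1.26.5, which suggests the render did not take its version from the cluster. `helm template` renders offline unless it is given `--validate`, and it accepts `--kube-version` to set the version explicitly. The sketch below only illustrates why v1.24.0 fails the chart's `>=1.25.0-0` constraint while 1.26.5 would pass. The function name `version_ge` is hypothetical, it relies on GNU `sort -V`, it handles plain MAJOR.MINOR.PATCH strings only, and it is not how Helm itself evaluates constraints.

```shell
#!/bin/sh
# Sketch: succeeds when version $1 is >= version $2.
# Assumes plain MAJOR.MINOR.PATCH strings (no build or pre-release suffix)
# and a `sort` that supports -V (GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the version named in the error and the cluster's server version
# against the chart's minimum.
for v in 1.24.0 1.26.5; do
    if version_ge "$v" 1.25.0; then
        echo "$v satisfies >=1.25.0"
    else
        echo "$v does not satisfy >=1.25.0"
    fi
done
```

A vendor version string such as v1.26.5+vmware.2-fips.1 would need its prefix and suffix stripped before being passed to this function.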

AndrewChubatiuk commented 1 week ago

is argocd running on the same cluster?

passie commented 1 week ago

is argocd running on the same cluster?

Yes

AndrewChubatiuk commented 1 week ago

please try adding --validate flag

passie commented 1 week ago

STDERR:
  Error: template: victoria-metrics-k8s-stack/charts/victoria-metrics-operator/templates/uninstall_hook.yaml:27:20: executing "victoria-metrics-k8s-stack/charts/victoria-metrics-operator/templates/uninstall_hook.yaml" at <include "vm.image" (dict "helm" . "app" $app)>: error calling include: template: victoria-metrics-k8s-stack/charts/victoria-metrics-operator/charts/victoria-metrics-common/templates/_image.tpl:7:17: executing "vm.image" at <tpl (printf "%s:%s" .app.image.repository (.app.image.tag | default $Chart.AppVersion)) .>: error calling tpl: cannot retrieve Template.Basepath from values inside tpl function: bitnami/kubectl:1.26: "BasePath" is not a value
  Use --debug flag to render out invalid YAML

COMBINED OUTPUT:
  Error: template: victoria-metrics-k8s-stack/charts/victoria-metrics-operator/templates/uninstall_hook.yaml:27:20: executing "victoria-metrics-k8s-stack/charts/victoria-metrics-operator/templates/uninstall_hook.yaml" at <include "vm.image" (dict "helm" . "app" $app)>: error calling include: template: victoria-metrics-k8s-stack/charts/victoria-metrics-operator/charts/victoria-metrics-common/templates/_image.tpl:7:17: executing "vm.image" at <tpl (printf "%s:%s" .app.image.repository (.app.image.tag | default $Chart.AppVersion)) .>: error calling tpl: cannot retrieve Template.Basepath from values inside tpl function: bitnami/kubectl:1.26: "BasePath" is not a value
  Use --debug flag to render out invalid YAML

AndrewChubatiuk commented 1 week ago

helm 3.14+ is required

passie commented 1 week ago

helm 3.14+ is required

We are running ArgoCD 2.12.3 which has helm v3.15.2+g1a500d5.

AndrewChubatiuk commented 1 week ago

It looks like you got this error from a local execution, not from ArgoCD:


STDERR:
  Error: template: victoria-metrics-k8s-stack/charts/victoria-metrics-operator/templates/uninstall_hook.yaml:27:20: executing "victoria-metrics-k8s-stack/charts/victoria-metrics-operator/templates/uninstall_hook.yaml" at <include "vm.image" (dict "helm" . "app" $app)>: error calling include: template: victoria-metrics-k8s-stack/charts/victoria-metrics-operator/charts/victoria-metrics-common/templates/_image.tpl:7:17: executing "vm.image" at <tpl (printf "%s:%s" .app.image.repository (.app.image.tag | default $Chart.AppVersion)) .>: error calling tpl: cannot retrieve Template.Basepath from values inside tpl function: bitnami/kubectl:1.26: "BasePath" is not a value
  Use --debug flag to render out invalid YAML

COMBINED OUTPUT:
  Error: template: victoria-metrics-k8s-stack/charts/victoria-metrics-operator/templates/uninstall_hook.yaml:27:20: executing "victoria-metrics-k8s-stack/charts/victoria-metrics-operator/templates/uninstall_hook.yaml" at <include "vm.image" (dict "helm" . "app" $app)>: error calling include: template: victoria-metrics-k8s-stack/charts/victoria-metrics-operator/charts/victoria-metrics-common/templates/_image.tpl:7:17: executing "vm.image" at <tpl (printf "%s:%s" .app.image.repository (.app.image.tag | default $Chart.AppVersion)) .>: error calling tpl: cannot retrieve Template.Basepath from values inside tpl function: bitnami/kubectl:1.26: "BasePath" is not a value
  Use --debug flag to render out invalid YAML

AndrewChubatiuk commented 1 week ago

Your ArgoCD is complaining about a Service that it cannot find in the rendered template:

Error: no matches for Id Service.v1.[noGrp]/vm-coredns.kube-system; failed to find unique target for patch Service.v1.[noGrp]/vm-coredns.kube-system]

can i see your helmfile config?

passie commented 1 week ago

My bad, you are correct. I was running the validate for you from the command line, which had an older version of Helm. I have upgraded Helm to 3.16.

Now getting the same error as Argocd.

STDERR:
  Error: Unable to continue with install: ServiceAccount "victoria-metrics-kube-state-metrics" in namespace "victoria-metrics" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "victoria-metrics"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "victoria-metrics"

COMBINED OUTPUT:
  Error: Unable to continue with install: ServiceAccount "victoria-metrics-kube-state-metrics" in namespace "victoria-metrics" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "victoria-metrics"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "victoria-metrics"

AndrewChubatiuk commented 1 week ago

could you please share your helmfile configuration?

passie commented 6 days ago

This is our app.yaml

# ArgoCD application settings
---
name: vmagent
namespace: victoria-metrics
createNamespace: true
serverSideApply: false
replace: false
PruneLast: false

This is our helmfile.yaml

---
repositories:
  - name: nexus-victoria-metrics
    url: https://repo.example.com/repository/helm-proxy-victoria-metrics

releases:
  - name: victoria-metrics
    namespace: victoria-metrics
    chart: nexus-victoria-metrics/victoria-metrics-k8s-stack
    missingFileHandler: Warn
    version: 0.25.17
    values:
      - ./values.yaml.gotmpl

AndrewChubatiuk commented 6 days ago

this message

  Error: Unable to continue with install: ServiceAccount "victoria-metrics-kube-state-metrics" in namespace "victoria-metrics" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "victoria-metrics"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "victoria-metrics"

shows that the service account was installed by another Helm release. @passie, could you please make sure that all expected annotations are present on the resources managed by the k8s-stack chart and that there is no resource name collision?

passie commented 6 days ago

All resources are managed by ArgoCD. You are correct that not all resources have the annotations on them. I'm trying to work out a method so I can set them manually, like:

export NAMESPACE=victoria-metrics
export RELEASE_NAME=victoria-metrics
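# -o name omits the header row (NAME) that awk '{print $1}' would pass on to xargs
kubectl get sa -n "$NAMESPACE" -o name | xargs -I{} kubectl annotate -n "$NAMESPACE" {} meta.helm.sh/release-namespace="$NAMESPACE" meta.helm.sh/release-name="$RELEASE_NAME" --overwrite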

However, I keep finding more resources that are missing the annotations. I'm not sure what has changed. I'll keep you posted on my progress.
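
Helm decides whether it may adopt an existing resource from its ownership metadata: the two annotations named in the error, and also the label app.kubernetes.io/managed-by=Helm. Since more resource kinds than ServiceAccounts appear to be affected, one option is to generate the commands for review instead of running them blindly. The following is a minimal sketch under that assumption. The function name `adopt_commands` is hypothetical, and it only prints commands, it does not contact a cluster.

```shell
#!/bin/sh
# Sketch: read resource references (one kind/name per line, as printed by
# `kubectl get ... -o name`) on stdin and print the kubectl commands that
# would add Helm ownership metadata. Nothing is executed here.
#   $1 = Helm release name, $2 = release namespace
adopt_commands() {
    release=$1
    namespace=$2
    while IFS= read -r resource; do
        # Skip empty lines.
        [ -n "$resource" ] || continue
        printf 'kubectl -n %s annotate %s meta.helm.sh/release-name=%s meta.helm.sh/release-namespace=%s --overwrite\n' \
            "$namespace" "$resource" "$release" "$namespace"
        printf 'kubectl -n %s label %s app.kubernetes.io/managed-by=Helm --overwrite\n' \
            "$namespace" "$resource"
    done
}

# Example usage (review the output before piping it to sh):
#   kubectl -n victoria-metrics get sa,svc,cm,deploy -o name \
#     | adopt_commands victoria-metrics victoria-metrics
```

Printing first makes it easy to exclude resources that do not belong to the release, such as the namespace's default service account.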

passie commented 5 days ago

It looks like there are 2 problems.

1) The vm-coredns resources were renamed to vm-core-dns. Since we use strategicMergePatches: on these resources, the patches needed to be changed to the new names as well (old: name: vm-coredns, new: name: vm-core-dns).

    strategicMergePatches:
      - apiVersion: v1
        kind: Service
        metadata:
          name: vm-core-dns
          namespace: kube-system
        $patch: delete

I couldn't find a changelog entry for this. Maybe I'm overlooking it.

2) The annotation problem, for which I haven't found the cause. It could have something to do with helmfile. Not sure yet.

AndrewChubatiuk commented 5 days ago

We recently changed the rendering logic of the ServiceMonitor, Service, and Endpoints templates, and it looks like that caused these name changes. Instead of using kustomize, you can disable the Service with coreDns.service.enabled: false
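
When a chart upgrade renames resources like this, listing the rendered names before updating any patch targets can confirm what a patch should point at. The sketch below is a rough helper for that, with a hypothetical function name `list_resources`. It assumes each rendered document has a top-level `kind:` and a `metadata:` block whose `name:` is indented by exactly two spaces and unquoted, which is typical of Helm output but not guaranteed.

```shell
#!/bin/sh
# Sketch: print "Kind/name" for every document in a multi-document YAML
# stream read from stdin. Not a real YAML parser; see the assumptions above.
list_resources() {
    awk '
        function emit() {
            if (name != "") print kind "/" name
            kind = ""; name = ""; in_meta = 0
        }
        /^---/      { emit(); next }        # document separator
        /^kind:/    { kind = $2 }
        /^metadata:/ { in_meta = 1; next }  # entering the metadata block
        /^[^ #]/    { in_meta = 0 }         # any other top-level key ends it
        in_meta && /^  name:/ && name == "" { name = $2 }
        END         { emit() }
    '
}

# Example usage, with the environment name taken from the thread:
#   helmfile template -e cluster01 | list_resources | grep -i dns
```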