grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

helm chart v0.1.1 fails to install with default values.yaml #765

Open hollanbm opened 6 months ago

hollanbm commented 6 months ago

What's wrong?

https://github.com/grafana/alloy/blob/main/operations/helm/charts/alloy/charts/crds/crds/monitoring.grafana.com_podlogs.yaml

The CustomResourceDefinition "podlogs.monitoring.grafana.com" is invalid: status.storedVersions[0]: Invalid value: "v1alpha1": must appear in spec.versions
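This error comes from the API server: a PodLogs CRD already exists on the cluster with v1alpha1 recorded in status.storedVersions, while the manifest being applied no longer lists that version under spec.versions. A minimal way to confirm what the live object serves versus stores (plain kubectl, nothing Alloy-specific):

# Versions the manifest on the cluster serves, and versions the API server has stored
kubectl get crd podlogs.monitoring.grafana.com -o jsonpath='{.spec.versions[*].name}{"\n"}'
kubectl get crd podlogs.monitoring.grafana.com -o jsonpath='{.status.storedVersions}{"\n"}'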

Steps to reproduce

https://grafana.com/docs/alloy/latest/get-started/install/kubernetes/

I am working on standing up the grafana stack on my cluster.

I installed these charts first:

grafana/loki -- chart v6.4.2
grafana/promtail -- chart v6.15.5

I have not (yet) installed Prometheus or any other Grafana Helm charts.

Upon installing the grafana/alloy chart v0.1.1 (with default values), I get the following error when the CRDs are installed.

The CustomResourceDefinition "podlogs.monitoring.grafana.com" is invalid: status.storedVersions[0]: Invalid value: "v1alpha1": must appear in spec.versions
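For what it's worth, a minimal reproduction outside Flux should hit the same error; this is a sketch mirroring the HelmRelease shown under Configuration below, not a command taken from the report:

# Add the Grafana chart repo and install the same chart version with default values
helm repo add grafana https://grafana.github.io/helm-charts
helm install alloy grafana/alloy --namespace monitoring --version 0.1.1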

To troubleshoot, I uninstalled the Helm release and attempted to apply the crds.yaml manually.

That's when I received the same error and realized what was happening.
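Applying the CRD manifest directly from a checkout of grafana/alloy (the path linked under "What's wrong?") reproduces the failure without Helm involved:

# Same validation error comes straight from the API server
kubectl apply -f operations/helm/charts/alloy/charts/crds/crds/monitoring.grafana.com_podlogs.yaml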

System information

k3s v1.29.3

Software version

No response

Configuration

The chart is installed via Flux:

---
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: alloy
  namespace: monitoring
spec:
  chart:
    spec:
      chart: alloy
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: flux-system
      version: 0.1.1
  interval: 15m
  releaseName: alloy
  timeout: 10m
  install:
    crds: CreateReplace
    remediation:
      retries: 1
      remediateLastFailure: true
  upgrade:
    crds: CreateReplace
    cleanupOnFail: true
    remediation:
      retries: 1
      remediateLastFailure: true
  rollback:
    recreate: true
    cleanupOnFail: true
  values:

➜  ~ kgno -o wide
NAME    STATUS   ROLES                       AGE     VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION        CONTAINER-RUNTIME
k3s1    Ready    control-plane,etcd,master   132d    v1.29.3+k3s1   10.254.1.71   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s10   Ready    worker                      125d    v1.29.3+k3s1   10.254.1.64   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s11   Ready    worker                      125d    v1.29.3+k3s1   10.254.1.65   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s2    Ready    worker                      3h52m   v1.29.3+k3s1   10.254.1.72   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s3    Ready    control-plane,etcd,master   132d    v1.29.3+k3s1   10.254.1.73   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s4    Ready    worker                      132d    v1.29.3+k3s1   10.254.1.74   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s5    Ready    control-plane,etcd,master   132d    v1.29.3+k3s1   10.254.1.75   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s6    Ready    worker                      67m     v1.29.3+k3s1   10.254.1.76   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s7    Ready    worker                      125d    v1.29.3+k3s1   10.254.1.61   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s8    Ready    worker                      125d    v1.29.3+k3s1   10.254.1.62   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2
k3s9    Ready    worker                      125d    v1.29.3+k3s1   10.254.1.63   <none>        Debian GNU/Linux 12 (bookworm)   6.6.20+rpt-rpi-2712   containerd://1.7.11-k3s2

Logs

No response

behdadkh commented 6 months ago

Facing the exact same problem while trying to install Alloy Helm chart 0.1.1.

hainenber commented 6 months ago

It seems a similar CRD created by the Loki chart is the culprit for the conflict here. We'll probably have to switch Loki's self-monitoring from Agent to Alloy in the long run.
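If you want to confirm which chart owns the existing CRD before deleting anything, its metadata is a reasonable place to look (a sketch; Helm only stamps these annotations when the CRD is rendered as a template, so CRDs shipped from a chart's crds/ folder may carry none):

# Print ownership annotations/labels on the conflicting CRD, if any
kubectl get crd podlogs.monitoring.grafana.com -o jsonpath='{.metadata.annotations}{"\n"}{.metadata.labels}{"\n"}'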

I managed to dig out this workaround in another thread. Hope this helps out!

behdadkh commented 6 months ago

Thanks @hainenber, this indeed pointed me in the right direction.

I was able to install Alloy Helm chart 0.1.1 alongside Loki Helm chart 5.47.2. I also had to set test.enabled to false for Loki, and to manually delete the conflicting CRDs:

kubectl delete crds grafanaagents.monitoring.grafana.com podlogs.monitoring.grafana.com

Then re-syncing Alloy (via Argo CD) went through successfully.
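For the record, the Loki side of the workaround amounts to values like the following; a sketch assuming the 5.x chart's keys (test.enabled is confirmed above, and the monitoring.selfMonitoring.* names are taken from that chart's values.yaml):

# loki chart values (5.x): disable the helm test and the Agent-based self-monitoring
test:
  enabled: false
monitoring:
  selfMonitoring:
    enabled: false
    grafanaAgent:
      installOperator: false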

github-actions[bot] commented 5 months ago

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it. If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue. The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity. Thank you for your contributions!

mcoreix commented 2 months ago

Facing the same issue. Even with the workaround from @hainenber, it is not deploying.

thinklinux commented 1 month ago

Facing the same issue on 0.6.0.