grafana / agent

Vendor-neutral programmable observability pipelines.
https://grafana.com/docs/agent/
Apache License 2.0

CRD `grafanaagents.monitoring.grafana.com` cannot be deployed because of annotation length #4379

Closed dangmai closed 9 months ago

dangmai commented 1 year ago

What's wrong?

I've been using the Grafana Agent Helm chart successfully for a while now, but it looks like the latest version can't be deployed because of the annotation length.

Full error is: one or more objects failed to apply, reason: CustomResourceDefinition.apiextensions.k8s.io "grafanaagents.monitoring.grafana.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes (retried 5 times).

Steps to reproduce

Deploy the grafana-agent-operator version 0.2.16 using Helm and ArgoCD with this chart:

apiVersion: v2
name: grafana

type: application
version: 1.0.0
appVersion: "1.0.0"

dependencies:
  - name: grafana-agent-operator
    version: 0.2.16
    repository: https://grafana.github.io/helm-charts

System information

k3s v1.27.2+k3s1, Linux 5.15.0-73-generic

Software version

Grafana Agent Operator 0.2.16

Configuration

# This is my `values.yaml` file used for this Helm chart
grafana-agent-operator:
  kubeletService:
    namespace: grafana
    serviceName: kubelet

Logs

No response

eduanb commented 1 year ago

@dangmai I had the same issue. A workaround is to use server-side apply: https://kubernetes.io/docs/reference/using-api/server-side-apply/

dangmai commented 1 year ago

@eduanb Thank you for pointing me to that. It led me down the rabbit hole of researching this issue, and it looks like other charts have run into this before, most prominently this one: https://github.com/prometheus-community/helm-charts/issues/1500 There are some workarounds there for different tools (ArgoCD, pulumi, etc.).

The exact workaround I used is this one: https://github.com/prometheus-community/helm-charts/issues/1500#issuecomment-1132907207

I'll leave it up to the maintainers to decide whether this is a bug or not.

eduanb commented 1 year ago

For reference, here is my Argo app that worked and includes the server-side apply

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: grafana-agent-operator
  namespace: grafana-agent
spec:
  destination:
    namespace: grafana-agent
    server: https://kubernetes.default.svc
  source:
    repoURL: https://grafana.github.io/helm-charts
    targetRevision: 0.2.16
    helm:
      releaseName: grafana-agent-operator
    chart: grafana-agent-operator
  syncPolicy:
    syncOptions:
      # Server-side apply is needed because the resource is too big to fit in the 262144-byte annotation size limit
      - ServerSideApply=true

d-rk commented 1 year ago

as a workaround the descriptions can be deleted from the CRD:

yq -i eval 'del(.. | .description?)' monitoring.grafana.com_grafanaagents.yaml

lg-d commented 1 year ago

+1 Just came across this problem, and it is an annoyance to fix.

FabianKnapp commented 1 year ago

+1 Please fix this!

hackwish commented 1 year ago

+1 here. The last functional, deployable version of this CR is v0.33.2.

FredrikAugust commented 1 year ago

Thanks, @hackwish!

For those of us using the Helm chart, that corresponds to chart version 0.2.15, which runs v0.32.1.

blodone commented 1 year ago

I also came across this issue, and it's annoying. It seems to be caused by kubectl apply storing the last-applied configuration in an annotation on the object.

kubectl create -f ... is a workaround for this
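
To make the cause concrete: client-side kubectl apply records the submitted object as JSON in the `kubectl.kubernetes.io/last-applied-configuration` annotation, and the API server rejects objects whose annotations total more than 262144 bytes. The sketch below is a rough estimate, not kubectl's actual logic. The function names `last_applied_size` and `fits_client_side_apply` are hypothetical, and the compact JSON encoding is an assumption, so the real annotation size may differ slightly.

```python
import json

# Limit quoted in the error message (applies to all annotations combined).
ANNOTATION_LIMIT_BYTES = 262144


def last_applied_size(manifest: dict) -> int:
    """Estimate the bytes needed to store `manifest` as a JSON annotation value.

    Assumption: compact JSON encoded as UTF-8 approximates what client-side
    apply stores; the exact encoding used by kubectl may differ.
    """
    return len(json.dumps(manifest, separators=(",", ":")).encode("utf-8"))


def fits_client_side_apply(manifest: dict) -> bool:
    """Return True if the estimated annotation stays within the limit."""
    return last_applied_size(manifest) <= ANNOTATION_LIMIT_BYTES
```

A CRD whose schema carries long description strings can exceed this estimate even though the object itself is valid, which is consistent with kubectl create and server-side apply working where client-side apply fails.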

aptomaKetil commented 1 year ago

Another workaround is to use server-side apply, i.e. kubectl apply -f ... --server-side.

mjacobsson commented 1 year ago

+1 Please fix. very annoying :(

MrAbaddon commented 1 year ago

+1 Please fix.

mecostav commented 1 year ago

+1 Please fix. This is a most welcome fix

rfratto commented 1 year ago

Please don't comment with +1; comments aren't how we prioritize issues and it's only causing more noise in our notifications.

Instead, please thumbs up the original issue. I'll be marking the +1 comments as off-topic so the conversation is more focused.

tomhobson commented 10 months ago

Hi all,

https://github.com/grafana/agent/blob/main/tools/generate-crds.bash

Is it possible to create a lightweight agent and remove the descriptions using @d-rk's suggestion?

yq -i eval 'del(.. | .description?)' monitoring.grafana.com_grafanaagents.yaml

Something like this

generate-lightweight-crds.bash

#!/usr/bin/env bash
# Stop on the first failing command, unset variable, or failed pipeline stage.
set -euo pipefail

ROOT=$(git rev-parse --show-toplevel)

# Generate objects and controllers for our CRDs
cd "$ROOT/pkg/operator/apis/monitoring/v1alpha1"
controller-gen object paths=.
controller-gen crd:crdVersions=v1 paths=. output:crd:dir="$ROOT/production/operator/crds"

# Generate CRDs for prometheus-operator.
PROM_OP_DEP_NAME="github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1"
PROM_OP_DIR=$(go list -f '{{.Dir}}' "$PROM_OP_DEP_NAME")

cd "$PROM_OP_DIR"
controller-gen crd:crdVersions=v1 paths=. output:crd:dir="$ROOT/production/operator/crds"

# Remove known Prometheus-Operator CRDS we don't generate. (An allowlist would
# be better here, but rfratto's bash skills are bad.)
rm -f "$ROOT/production/operator/crds/monitoring.coreos.com_alertmanagers.yaml"
rm -f "$ROOT/production/operator/crds/monitoring.coreos.com_prometheuses.yaml"
rm -f "$ROOT/production/operator/crds/monitoring.coreos.com_prometheusrules.yaml"
rm -f "$ROOT/production/operator/crds/monitoring.coreos.com_thanosrulers.yaml"

# Remove all descriptions from the generated CRDs to create lightweight versions.
# Loop over the files: yq's -i flag is documented to update only the first file given.
for crd in "$ROOT"/production/operator/crds/*.yaml; do
  yq -i eval 'del(.. | .description?)' "$crd"
done

Within a CI/CD or GitOps flow, it's really hard to apply these kinds of workarounds; you essentially need to carve a big change into your pipeline.

I think that if you're deploying through CI/CD that way, you don't really need the descriptions on the agent spec. They might be useful if you're developing against the agent, but most people are deploying the Grafana Agent without touching any internals, or deploying a custom Grafana stack that includes an agent.

It's a bit hacky, and I prefer resources to be completely generated and not touched afterwards, but this could be a route that makes sense: it gets around the byte limit, and you seem to be manually editing the CRDs after generation anyway. So long as it's well documented?
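
One caveat with the yq expression is that it deletes every key named description, which may also remove a schema property that happens to be called description. The sketch below illustrates a more conservative variant on an already-parsed CRD (nested dicts and lists). The function name `strip_descriptions` is hypothetical, and the rule "only drop description keys whose value is a string" is an assumption about how documentation text can be told apart from field definitions.

```python
def strip_descriptions(node):
    """Return a copy of `node` without documentation-style `description` keys.

    Assumption: documentation text is always a string, while a schema
    property that is itself named `description` maps to a dict and is kept.
    """
    if isinstance(node, dict):
        return {
            key: strip_descriptions(value)
            for key, value in node.items()
            if not (key == "description" and isinstance(value, str))
        }
    if isinstance(node, list):
        return [strip_descriptions(item) for item in node]
    return node
```

The input is left unmodified, so the full CRD can still be published for people developing against the agent while the stripped copy is used for deployment.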