open-telemetry / opentelemetry-helm-charts

OpenTelemetry Helm Charts
https://opentelemetry.io
Apache License 2.0
391 stars 474 forks source link

[operator] Upgrading to v0.57.0 and above #1184

Open TylerHelmuth opened 4 months ago

TylerHelmuth commented 4 months ago

The OpenTelemetry Operator had a significant release in version 0.99.0 that includes a new version of the OpenTelemetryCollector CRD. This includes a conversion webhook, which needs to reference a namespaced webhook Service, and therefore the CRD needs to include the release namespace.

For upgrading see https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-operator/UPGRADING.md#0560-to-0570.

For the original discussion, see https://github.com/open-telemetry/opentelemetry-helm-charts/issues/1167

sumeet-zuora commented 4 months ago

Seems after upgrade my collectors are failing as the CRD was not upgraded

Helm upgrade failed: resource mapping not found for name: "elastic-apm" namespace: "" from "": no matches for kind "OpenTelemetryCollector" in version "opentelemetry.io/v1beta1" ensure CRDs are installed first
TylerHelmuth commented 4 months ago

@sumeet-zuora the beta CRDs are not yet managed by the helm chart. We first need to merge https://github.com/open-telemetry/opentelemetry-helm-charts/pull/1176. If you are using beta Operator CRDs with helm chart version 0.57.0 you'll need to manage the CRDs yourself. The helm chart will start managing the beta CRDs with version 0.58.0

diranged commented 4 months ago

Hey ... so the change at https://github.com/open-telemetry/opentelemetry-helm-charts/pull/1175 is pretty painful for people who are running ct to do chart-testing... because it becomes impossible to install the opentelemetry-operator chart at the same time as creating resources that use the CRDs defined in that chart - CT/Helm complain that the CRD doesn't exist yet.

I understand the motivation of putting the CRDs into the templates - but I find that is almost always a painful way to solve the issue.. I think it'd make more senes for the operator to find the CRD and patch it with the admission controller settings as necessary.

Either way .. if the chart exposed the ability to set annotations on the CRDs, then we can use helm hooks to define that the CRDs are installed first, which I think at least helps us move forward in a more sane way.

TylerHelmuth commented 4 months ago

because it becomes impossible to install the opentelemetry-operator chart at the same time as creating resources that use the CRDs defined in that chart

I thought this was impossible before as well, unless you set some fail policies on the webhooks. This issue is a major reason why we are adding the opentelemetry-kube-stack chart. How were you managing this?

Since this chart does not manage CRs we have not factored in this use case. Can you go into more details?

swiatekm commented 4 months ago

Allowing annotations to be added to the CRDs is at least straightforward, even if it will introduce other problems due to the way Helm manages hooks. But if you want to include the operator chart as a dependency and define CRs, then that's probably the most straightforward fix for now.

EDIT: I don't think the above works. Helm will complain that the resource kind does not exist even if you mark it as a pre-install hook, and even disable openapi validation.

I think it'd make more senes for the operator to find the CRD and patch it with the admission controller settings as necessary.

This sounds like quite the can of worms. Is there any major operator that does this successfully?

swiatekm commented 4 months ago

Worth noting that even if you solve the CRD problem, applying v1alpha1 CRDs while also installing the operator will not work, as the conversion webhook is necessary for that, and it only works while the operator is running. If you have control over what CRs get created on install, then you could just convert the manifests to v1beta1 and ship your own CRD without the conversion webhook.