Closed mjnagel closed 2 months ago
This issue is currently awaiting triage.
If metrics-server contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
@mjnagel if you set the serviceMonitor.enabled value to true on a cluster without the ServiceMonitor CRD installed, the expected behaviour is to error and fail the apply. Helm is a promise-based system, and if it can't keep a promise it should error rather than silently break the promise.
Capabilities checks, especially for non-core APIs, should be used sparingly, as they can behave incorrectly in systems where Helm isn't connected to the target Kubernetes API. This is why the Helm idiom isn't to use capabilities paired with the fail function around resources where the API might not be present, and instead to fail at apply time.
The TL;DR here is that the cluster operator should know the spec of the cluster and set the Helm values accordingly.
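As a rough illustration of the values-only approach described above, the guard might look like the sketch below. The exact template in the metrics-server chart is not quoted in this thread, so the condition is an assumption based on the two values named here (metrics.enabled and serviceMonitor.enabled), and the resource name, label, and port are placeholders.

```yaml
{{- /* Sketch only: the decision comes purely from values, with no cluster lookup. */ -}}
{{- if and .Values.metrics.enabled .Values.serviceMonitor.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ .Release.Name }}-example   # placeholder name, not the chart's real helper
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}   # placeholder label
  endpoints:
    - port: https   # placeholder port name
{{- end }}
```

With this shape the rendered output is the same whether or not Helm can reach a cluster. By contrast, helm template by default renders without contacting the cluster, so a .Capabilities.APIVersions.Has check for a CRD group evaluates to false there unless the version is supplied with the --api-versions flag.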
@stevehipwell I think that's totally fair, although I've definitely seen both patterns followed (fail vs "safety net") depending on the helm chart. I've seen this specific capability check (servicemonitor) on a number of other charts I use (fluentbit and loki are two that jump to mind), although there are definitely others I use (to include metrics-server) that don't have it. Totally understand if the perspective from the maintainers of this chart is to error rather than silently resolve - there are certainly other ways to accomplish similar behavior for handling different environments (layered values files, etc).
Just to throw two other options out there and see if there is still a route to make this work:
{{- if and (or .Values.serviceMonitor.enabled (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1")) .Values.metrics.enabled -}}
It's more complex templating to be sure, but it would still allow/force failure if someone explicitly enables the ServiceMonitor and doesn't have the capability. It would also work for my use case, where basically all I want is for the ServiceMonitor to auto-enable based on the cluster.
@mjnagel I don't see any benefit to changing the current conditional. It checks that you want metrics enabled and that you also want to scrape them with a ServiceMonitor before running the template.
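For reference, here is how the conditional proposed earlier in the thread would evaluate for each combination of inputs. This is a sketch that reads the expression as written; the ServiceMonitor body is omitted and nothing here reflects the chart's actual template.

```yaml
{{- /*
  Proposed guard: (serviceMonitor.enabled OR CRD group present) AND metrics.enabled

  metrics.enabled  serviceMonitor.enabled  CRD group present  result
  false            any                     any                not rendered
  true             true                    true               rendered
  true             true                    false              rendered, apply fails
  true             false                   true               rendered (auto-enabled)
  true             false                   false              not rendered
*/ -}}
{{- if and (or .Values.serviceMonitor.enabled (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1")) .Values.metrics.enabled -}}
# ServiceMonitor manifest would go here
{{- end }}
```

One consequence worth noting: with this expression, setting serviceMonitor.enabled to false no longer prevents the ServiceMonitor from being created on a cluster that has the CRD group, as long as metrics.enabled is true.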
FYI I'd have removed the capability check from the Fluent Bit chart if I wasn't working on replacement charts.
Thanks for the dialogue - really just comes down to making it more dynamic for me vs doing explicitly what is asked in the values. Totally see the perspective of treating values as a promise!
What would you like to be added:
Currently the ServiceMonitor in the chart is conditional on two enabled values. I think it would help to add a Helm capabilities check to validate that the ServiceMonitor CRD is present.
Why is this needed:
The current conditionals are helpful, but when deploying the same config across dev/test/staging/prod environments I may have only a subset of applications on a given cluster. That subset might not always include monitoring/Prometheus, so it is useful to be able to enable metrics without having the Helm install fail. This appears to be a common pattern in a lot of Helm charts leveraging Prometheus CRs.
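A minimal sketch of the kind of check being requested, assuming the chart's existing guard is an and over metrics.enabled and serviceMonitor.enabled (the actual template may differ):

```yaml
{{- /* Sketch only: skip the ServiceMonitor when the CRD's API group is not reported by the cluster. */ -}}
{{- if and .Values.metrics.enabled .Values.serviceMonitor.enabled (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") }}
# ServiceMonitor manifest would go here
{{- end }}
```

With this guard, a cluster without the monitoring.coreos.com/v1 API group would silently skip the resource instead of failing the install, even when both values are set to true.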
/kind feature