Encountered the same issue today. Seems to be related to some CRD changes: https://github.com/prometheus-operator/prometheus-operator/issues/5197 .
Maybe we need to populate this struct with some defaults?
From the linked issue, it seems updating your Prometheus ServiceMonitor CRDs in the cluster may be the resolution: https://github.com/prometheus-operator/prometheus-operator/issues/5197#issuecomment-1446150799
Please let me know if that works :)
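(For reference, updating those CRDs typically means server-side applying each manifest, e.g. `kubectl apply --server-side -f <crd-manifest-url>` for each CRD URL from the matching kube-prometheus-stack release; client-side apply can fail on the larger CRDs because of the annotation size limit. The exact invocation may vary with your cluster setup.)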
I still have the error after applying these manifests:
- https://raw.githubusercontent.com/prometheus-community/helm-charts/kube-prometheus-stack-46.6.0/charts/kube-prometheus-stack/crds/crd-alertmanagerconfigs.yaml
- https://raw.githubusercontent.com/prometheus-community/helm-charts/kube-prometheus-stack-46.6.0/charts/kube-prometheus-stack/crds/crd-alertmanagers.yaml
- https://raw.githubusercontent.com/prometheus-community/helm-charts/kube-prometheus-stack-46.6.0/charts/kube-prometheus-stack/crds/crd-podmonitors.yaml
- https://raw.githubusercontent.com/prometheus-community/helm-charts/kube-prometheus-stack-46.6.0/charts/kube-prometheus-stack/crds/crd-probes.yaml
- https://raw.githubusercontent.com/prometheus-community/helm-charts/kube-prometheus-stack-46.6.0/charts/kube-prometheus-stack/crds/crd-prometheuses.yaml
- https://raw.githubusercontent.com/prometheus-community/helm-charts/kube-prometheus-stack-46.6.0/charts/kube-prometheus-stack/crds/crd-prometheusrules.yaml
- https://raw.githubusercontent.com/prometheus-community/helm-charts/kube-prometheus-stack-46.6.0/charts/kube-prometheus-stack/crds/crd-servicemonitors.yaml
- https://raw.githubusercontent.com/prometheus-community/helm-charts/kube-prometheus-stack-46.6.0/charts/kube-prometheus-stack/crds/crd-thanosrulers.yaml
My Prometheus object has these values:
evaluationInterval: 30s
scrapeInterval: 30s
Same here, using the most recent ServiceMonitor/PodMonitor CRDs from the prometheus-operator repository.
I'm temporarily running my own build with the following change applied, and it seems to work. I'm not sure whether this is a proper fix, however:
generator, err := prometheus.NewConfigGenerator(log.NewNopLogger(), &monitoringv1.Prometheus{
	Spec: monitoringv1.PrometheusSpec{
		CommonPrometheusFields: monitoringv1.CommonPrometheusFields{
			// Hardcode a default so the generated global scrape_interval
			// is not an empty (invalid) duration.
			ScrapeInterval: "30s",
		},
	},
}, true) // TODO replace Nop?
Seeing the same after updating to the latest TA (0.78.0). I think it's coming from the scrape time validation logic, in which we now have empty durations, since these now come from the Prometheus object passed to the config generator, where they are unset.
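For illustration, here is a minimal sketch of that failure mode, assuming the durations are validated via Prometheus' `model.ParseDuration` (which is what the common config types use when the generated YAML is unmarshalled again; the exact call site is an assumption):

```go
package main

import (
	"fmt"

	"github.com/prometheus/common/model"
)

func main() {
	// An unset ScrapeInterval reaches the generator as an empty string,
	// which is not a valid duration and fails validation.
	if _, err := model.ParseDuration(""); err != nil {
		fmt.Println("empty interval:", err)
	}

	// Populating a default, as in the change above, parses fine.
	d, err := model.ParseDuration("30s")
	fmt.Println(d, err) // 30s <nil>
}
```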
I think this is right, @eplightning. I was about to open a PR; would you like to instead?
@matej-g Please go ahead, I'm busy with something else at the moment.
I'm wondering whether it's due to this line since v0.78.0: https://github.com/open-telemetry/opentelemetry-operator/blob/70f22dc199dd328a70f314e323f95910f8686d3d/cmd/otel-allocator/go.mod#L21. In v0.77.0 it used to be https://github.com/open-telemetry/opentelemetry-operator/blob/0b26cbbfe904281713165235032a2fa351ebfbba/cmd/otel-allocator/go.mod#L21. Prometheus v0.43.0 seems very old. I tried to revert to v0.77.0, but it looks like it does not support Kubernetes 1.27.
@mcanevet Prometheus v0.43.0 was released in March 2023, whereas the last one we were on was from 2021.
Ok, then I must be wrong
No worries, Prometheus versioning is incredibly aggravating; it seems like they publish v1s and v2s and then retract them once a month!
Sorry for the delay. Although the fix worked, I wanted to understand where the failure was occurring, and I could not pinpoint it exactly until a few more tests (it's actually in the unmarshalling step). See the PR: https://github.com/open-telemetry/opentelemetry-operator/pull/1822
Thanks 🙇
Same happening here
Thank you @matej-g for the research! I would suggest merging this workaround until the real root cause is found.
Some suggestions on the code to avoid hardcoding params.
With the same configuration and the v0.80.0 release, I've got these errors:
{"level":"info","ts":"2023-07-04T15:30:51Z","logger":"opentelemetrycollector-resource","msg":"default","name":"traces"}
{"level":"info","ts":"2023-07-04T15:30:51Z","logger":"opentelemetrycollector-resource","msg":"validate update","name":"traces"}
{"level":"error","ts":"2023-07-04T15:32:28Z","logger":"controllers.OpenTelemetryCollector","msg":"failed to reconcile config maps","error":"failed to parse config: no scrape_configs available as part of the configuration","stacktrace":"github.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).RunTasks\n\t/workspace/controllers/opentelemetrycollector_controller.go:229\ngithub.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).Reconcile\n\t/workspace/controllers/opentelemetrycollector_controller.go:211\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226"}
{"level":"error","ts":"2023-07-04T15:32:28Z","msg":"Reconciler error","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","OpenTelemetryCollector":{"name":"metrics","namespace":"opentelemetry"},"namespace":"opentelemetry","name":"metrics","reconcileID":"e53f6432-cbf5-49df-8022-bf1eafa36138","error":"failed to parse config: no scrape_configs available as part of the configuration","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226"}
Hi there, you have to set at least one static config under scrape_configs for the Prometheus receiver. Commonly it is set to scrape the collector's own metrics. With that, it will work.
I use a ServiceMonitor to scrape the collectors' metrics, so I would rather not add scraping of the collector's own instance.
kubectl -n opentelemetry get servicemonitor
NAME AGE
opentelemetry-operator 25d
opentelemetry-collector-logs 149m
opentelemetry-collector-metrics-targetallocator 149m
opentelemetry-collector-metrics 149m
opentelemetry-collector-traces 149m
I understand, but the target allocator uses the Prometheus library with some tweaks on top of it. A workaround until this discussion is resolved is the following ;)
# Ref: https://github.com/open-telemetry/opentelemetry-operator/issues/1811
# Ref: https://github.com/open-telemetry/opentelemetry-operator/pull/1822/files
# Ref: https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/14597
prometheus:
  config:
    global:
      evaluation_interval: 60s
      scrape_interval: 60s
      scrape_timeout: 60s
    scrape_configs:
      - job_name: dummy
        static_configs:
          - targets:
              - 127.0.0.1:8888
  # Query for a list of jobs to target allocator or compatible endpoint
  target_allocator:
    collector_id: ${POD_NAME}
    endpoint: http://global-collector-targetallocator.open-telemetry-collector.svc:80
    interval: 30s
    http_sd_config:
      refresh_interval: 60s
The dummy scrape config makes it work. After that, it may be worth opening another issue to discuss this.
Thanks, I will try that...
Hi, I've got these logs at startup:
with this configuration: