Zebradil opened this issue 1 year ago (status: Open)
I dug around a bit and found out that the upstream Prometheus appends an endpoint's index to a `job_name`: https://github.com/prometheus-operator/prometheus-operator/blob/b5327e7e66e818b8b924332191d4888d45d270e4/pkg/prometheus/promcfg.go#L1075
Maybe it makes sense to do the same?
Hi @Zebradil - thanks for reporting.
Can you detail your use case here a bit? Are you doing closed-box monitoring, e.g. something like blackbox-exporter in Prometheus?
Or is your target the source of different-flavored metrics, serving them on different paths of its metrics HTTP port?
If it's the former, I'm tempted to mark this as a duplicate of #91, where we're exploring the best way to support closed-box monitoring in managed collection.
A current workaround for this is, as you said, creating a separate PodMonitoring per path, as mentioned here. You could also deploy a standalone GMP collector and manually configure scraping, as mentioned here.
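To illustrate the per-path split with the Keycloak endpoints from your report (the second resource name below is made up, just for the sketch):

```yaml
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: keycloak
  namespace: keycloak
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: keycloak
      app.kubernetes.io/name: keycloak
  endpoints:
    # Only the default metrics path in this resource.
    - interval: 30s
      path: /metrics
      port: http
---
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: keycloak-realm-master # hypothetical name for the second resource
  namespace: keycloak
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: keycloak
      app.kubernetes.io/name: keycloak
  endpoints:
    # The realm metrics path lives in its own PodMonitoring, so each job name is unique.
    - interval: 30s
      path: /realms/master/metrics
      port: http
```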
We have to be careful here as changing the `job_name` to append the endpoint index can break existing users' dashboards that rely on the `job` label matching the name of their PodMonitoring. As such, if we wanted to pursue your change, we'd likely want to bump the API version of the CRD, as it can be considered breaking.
Does that make sense?
cc @bwplotka
Actually this may be safe, as the operator uses relabeling rules to write the `PodMonitoring`'s name into the `job` label. Taking another look...
Yea - so the biggest risk with relabeling both metric paths to the same `job` is potential collisions if the same metric name and label set exists at both paths, which isn't great.
The same problem could arise if we, say, injected the endpoint index, or path, as a separate label, e.g. `endpoint_index:0` or `endpoint_path:/realms/master/metrics`.
Otherwise, we'd need to change the `job` label, which gets back to the breaking API change I mentioned above.
Hi @pintohutch, thank you for looking into that.
> Can you detail your use case here a bit?
I'm installing a Keycloak application from this Helm chart. The chart supports Prometheus metrics and includes a ServiceMonitor resource, as shown in the first comment. It is not the first time I have needed to convert a ServiceMonitor to a PodMonitoring, so I'm doing it with a small script that follows the conversion process described in the GCP docs. But this time the conversion produced an unusable PodMonitoring resource due to duplicated job names.
I was checking the upstream Prometheus to see how such a case is handled there. There, scraping job names include indexes (e.g. `ServiceMonitor/ns1/name1/0`, `ServiceMonitor/ns1/name1/1`, see the implementation).
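Roughly, the generated scrape config for a two-endpoint ServiceMonitor then has this shape (illustrative only, with placeholder namespaces and paths, not the exact upstream output):

```yaml
scrape_configs:
  # One scrape job per endpoint, disambiguated by the trailing index.
  - job_name: ServiceMonitor/ns1/name1/0
    metrics_path: /metrics
    # ...
  - job_name: ServiceMonitor/ns1/name1/1
    metrics_path: /realms/master/metrics
    # ...
```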
In this project, because of the relabeling, the generated job names don't affect the final metrics, so I assumed that including indexes in a `job_name` wouldn't affect anything. You can check my PR #480 for more details.
> Yea - so the biggest risk with relabeling both metric paths to the same `job` is potential collisions if the same metric name and label set exists at both paths, which isn't great.
This is not great, agreed. I missed the point that in the upstream the `job` label is not relabeled by default.
> The same problem could arise if we, say, injected the endpoint index, or path, as a separate label, e.g. `endpoint_index:0` or `endpoint_path:/realms/master/metrics`.
I don't get what would be the downside of adding an `endpoint_index` label.
An `endpoint_path` label probably isn't a good choice, because an endpoint can emit different metrics depending on query parameters.
To summarize, I see that my proposed approach to naming jobs can solve my issue, but at the same time it opens up the possibility of conflicts between metric names and labels emitted by different endpoints, and makes such conflicts harder to notice. The requirement of having unique endpoint ports per PodMonitoring, on the other hand, ensures no naming conflicts. Adding a new `endpoint_index` label seems to be a good solution (until I understand why it is not 🙂).
BTW, it'd be nice to have validation of a PodMonitoring custom resource for such a case. I spent some time looking around before I noticed the errors in the collector's logs. Ideally, a PodMonitoring resource whose endpoints share the same port should not be accepted.
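As a rough sketch of what such a check could look like if expressed as a CEL rule on the CRD schema (field names are taken from the PodMonitoring examples in this issue; the project might instead implement the check in a validating webhook):

```yaml
# Hypothetical OpenAPI/CEL validation attached to spec.endpoints:
x-kubernetes-validations:
  - rule: "self.all(e, size(self.filter(o, o.port == e.port)) == 1)"
    message: "each endpoint in a PodMonitoring must use a unique port"
```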
+1 for validation, great point.
And great to see another use case for this. Indeed some apps/exporters might require multiple scrapes due to different paths (although rare).
I think we need a small proposal for the path forward - we will look into it (and share it here). Either more expressive job names (e.g. like the Prometheus Operator's), or more dynamic configuration of the job name. Perhaps this can solve https://github.com/GoogleCloudPlatform/prometheus-engine/issues/91 in the mid term too.
Is there a way for you to split this into different PodMonitorings for now?
Hi @bwplotka,
Yes, at the moment I just manually create two PodMonitoring resources with different names. This is not ideal, because it increases the effort of keeping the GMP adaptation compatible with the upstream Helm chart, but I'm not blocked and it's good enough.
I'm looking forward to the outcome of your discussions.
Cool - we should definitely add a validating webhook check for this case, thanks for raising!
> I don't get what would be the downside of adding an `endpoint_index` label.
This is probably rare and maybe not that much of a concern, but I was thinking about how to handle the case where the target already emits time series with an `endpoint_index` label.
Assuming we'd use `metric_relabel_configs`, and since we leave `honor_labels=false`, Prometheus would rename the conflicting scraped label to `exported_endpoint_index` and use our `endpoint_index` label as the canonical one.
In that case, if that label is used at all in dashboarding or alerting, it could cause issues.
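Concretely, the clash would look something like this (illustrative series only, not from a real target):

```
# What the target itself exposes:
some_metric{endpoint_index="7"} 1
# What gets stored once we inject endpoint_index="0" as a target label
# with honor_labels=false:
some_metric{endpoint_index="0", exported_endpoint_index="7"} 1
```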
Though, we may be able to caveat that and not overwrite it if it's already present, e.g.:
```yaml
# Prime the endpoint_index assuming we can add it safely.
- target_label: __tmp_endpoint_index
  replacement: "0" # <- determined from the endpoint's index in the PodMonitoring
# Overwrite it with the target's own endpoint_index label, if one exists.
- source_labels: [endpoint_index]
  regex: (.+) # only match when the target actually has the label
  target_label: __tmp_endpoint_index
# Final relabeling.
- source_labels: [__tmp_endpoint_index]
  target_label: endpoint_index
```
But I'm unsure if that's causing more or less problems 😅 .
Just stumbled upon this issue while migrating a RabbitMQ ServiceMonitor to a PodMonitoring. Was there any progress since the end of May? As a reference, this is the ServiceMonitor `spec.endpoints`:
```yaml
- interval: 15s
  port: prometheus
  scheme: http
  scrapeTimeout: 1s
- interval: 15s
  params:
    family:
      - queue_coarse_metrics
      - queue_metrics
  path: /metrics/detailed
  port: prometheus
  scheme: http
  scrapeTimeout: 1s
```
GMP rightfully complains about identically named `job_name` keys.
Is the best workaround still to create different PodMonitorings?
Hi @m3adow - yes indeed, the best workaround for now is to use different PodMonitorings.
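For reference, a rough sketch of that split for your RabbitMQ case (the resource names and selector labels are placeholders, and the remaining ServiceMonitor fields such as the query params and scrape timeout need to be mapped per the conversion docs, so double-check them against the PodMonitoring CRD):

```yaml
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: rabbitmq # placeholder
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: rabbitmq # placeholder
  endpoints:
    - interval: 15s
      port: prometheus
---
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: rabbitmq-detailed # placeholder
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: rabbitmq # placeholder
  endpoints:
    - interval: 15s
      port: prometheus
      path: /metrics/detailed
      # plus the family query params and the 1s timeout, mapped per the conversion docs
```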
We haven't had a lot of recent demand for this feature so we've deprioritized it for now. If you have some use cases you'd like to promote here, please let us know.
I convert a ServiceMonitor resource rendered from a helm chart into a PodMonitoring resource according to the GMP documentation.
The initial ServiceMonitor resource
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  # Removed
  #annotations:
  labels:
    app.kubernetes.io/component: keycloak
    app.kubernetes.io/instance: keycloak
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: keycloak
    helm.sh/chart: keycloak-15.1.1
  name: keycloak
  namespace: keycloak
spec:
  endpoints:
    - interval: 30s
      path: /metrics
      port: http
    - interval: 30s
      path: /realms/master/metrics
      port: http
  namespaceSelector:
    matchNames:
      - keycloak
  selector:
    matchLabels:
      app.kubernetes.io/component: metrics
      app.kubernetes.io/instance: keycloak
      app.kubernetes.io/name: keycloak
```
The resulting PodMonitoring resource
```yaml
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  # Removed
  #annotations:
  labels:
    app.kubernetes.io/component: keycloak
    app.kubernetes.io/instance: keycloak
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: keycloak
    helm.sh/chart: keycloak-15.1.1
  name: keycloak
  namespace: keycloak
spec:
  endpoints:
    - interval: 30s
      path: /metrics
      port: http
    - interval: 30s
      path: /realms/master/metrics
      port: http
  selector:
    matchLabels:
      app.kubernetes.io/instance: keycloak
      app.kubernetes.io/name: keycloak
```
The difference
```diff
--- servicemonitor-keycloak.yaml	2023-05-19 13:43:45.708860839 +0200
+++ podmonitoring-keycloak.yaml	2023-05-19 13:47:15.243825855 +0200
@@ -1,31 +1,27 @@
-apiVersion: monitoring.coreos.com/v1
-kind: ServiceMonitor
+apiVersion: monitoring.googleapis.com/v1
+kind: PodMonitoring
 metadata:
   # Removed
   #annotations:
   labels:
     app.kubernetes.io/component: keycloak
     app.kubernetes.io/instance: keycloak
     app.kubernetes.io/managed-by: Helm
     app.kubernetes.io/name: keycloak
     helm.sh/chart: keycloak-15.1.1
   name: keycloak
   namespace: keycloak
 spec:
   endpoints:
     - interval: 30s
       path: /metrics
       port: http
     - interval: 30s
       path: /realms/master/metrics
       port: http
-  namespaceSelector:
-    matchNames:
-      - keycloak
   selector:
     matchLabels:
-      app.kubernetes.io/component: metrics
       app.kubernetes.io/instance: keycloak
       app.kubernetes.io/name: keycloak
```
After applying the PodMonitoring resource to my environment, I see the following errors in the collector pods:
The most interesting one seems to be this error:
I assume that the logic for generating job names doesn't take endpoint paths into account, so no two endpoints may use the same port name. Is that correct?
As a workaround, I'll create one PodMonitoring resource per port.