Open nrvnrvn opened 5 years ago
You're absolutely right, the documentation is faulty here. The targetPort field doesn't refer to the Service's fields; it refers to the port name or port number defined on the underlying selected Pods. Clarifying this in the docs is step 1.
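To illustrate what step 1 would document (a minimal sketch; the label and port values here are made up), targetPort matches a port name or number declared on the selected Pods' containers:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  # matches a containerPort named "metrics" on the selected Pods
  - targetPort: metrics
    interval: 30s
  # or, by number, a containerPort 8080 on the selected Pods:
  # - targetPort: 8080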
Step 2, as you also already mentioned: we should treat port as an IntOrStr just like targetPort and select either the port name or the port number, in the same fashion. In order to do that, Prometheus should expose the __meta_kubernetes_endpoint_port_number meta label. Would you like to go ahead and open this issue? If not I'm happy to do it, but as you discovered it, it's yours to take :slightly_smiling_face:
What were the initial considerations for using __meta_kubernetes_endpoint_port instead of __meta_kubernetes_service_port? I assume this was done because Endpoints contain the actual target information, but I'm still asking because, as we assumed, the ServiceMonitor should have taken the port number/name from the Service.
This is because the Endpoints object is what Prometheus actually uses for discovery purposes, and a Service is not necessarily always present. So going with the Endpoints object whenever possible is the safe bet.
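Concretely, the scrape config the operator generates for a ServiceMonitor discovers targets via the Endpoints objects, roughly like this (a sketch; the job and namespace names are illustrative):

- job_name: monitoring/example-app/0
  kubernetes_sd_configs:
  # discovery is driven by Endpoints objects, not by Service objects
  - role: endpoints
    namespaces:
      names:
      - monitoring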
This issue has been automatically marked as stale because it has not had any activity in last 60d. Thank you for your contributions.
/reopen
/cc @brancz
This issue has been automatically marked as stale because it has not had any activity in last 60d. Thank you for your contributions.
Is there something left to be answered?
The issue is still unresolved if I am not mistaken
This issue has been automatically marked as stale because it has not had any activity in last 60d. Thank you for your contributions.
I believe the docs still need to be adapted here.
This issue has been automatically marked as stale because it has not had any activity in last 60d. Thank you for your contributions.
Please remove the stale label again.
Contributions on improving the docs would be very welcome! :)
@brancz @nrvnrvn
If I don't specify a container port, but instead use targetPort in the ServiceMonitor and point it at the right container port (one that isn't declared on the container but does exist), would it work?
Yes, that should work when you specify the actual port number.
@brancz
In my case it doesn't work until I specify the container port in my pod; see the details here:
https://github.com/coreos/prometheus-operator/issues/3354
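For reference, a sketch of the combination that reportedly works in that scenario (values illustrative): the port is declared on the container, presumably so that the __meta_kubernetes_pod_container_port_number label the generated keep rule matches on actually exists:

# Pod template: declare the port on the container
containers:
- name: app
  ports:
  - containerPort: 9500

# ServiceMonitor endpoint: reference it by number
endpoints:
- targetPort: 9500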
I'm having trouble scraping metrics via a PodMonitor and found this issue, so I'd like to know whether it's related.
I'm defining a PodMonitor like this to scrape some OpenEBS pod metrics:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: openebs-cstor-pool-metrics
  namespace: openebs
spec:
  selector:
    matchLabels:
      app: cstor-pool
  podMetricsEndpoints:
  - port: "9500"
    interval: 10s
The problem is that the generated prometheus.yaml has the content below. It isn't working properly: Prometheus can't scrape any metrics because it doesn't find any targets.
- job_name: openebs/openebs-cstor-pool-metrics/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - openebs
  scrape_interval: 10s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_label_app
    regex: cstor-pool
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_container_port_name
    regex: "9500"
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - target_label: job
    replacement: openebs/openebs-cstor-pool-metrics
  - target_label: endpoint
    replacement: "9500"
But if I change the PodMonitor to the one below, it works:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: openebs-cstor-pool-metrics
  namespace: openebs
spec:
  selector:
    matchLabels:
      app: cstor-pool
  podMetricsEndpoints:
  - targetPort: 9500
    interval: 10s
Note that I changed port to targetPort (and supplied an integer in place of a string). This was the generated prometheus.yaml:
- job_name: openebs/openebs-cstor-pool-metrics/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - openebs
  scrape_interval: 10s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_label_app
    regex: cstor-pool
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_container_port_number # THIS IS DIFFERENT
    regex: "9500"
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - target_label: job
    replacement: openebs/openebs-cstor-pool-metrics
  - target_label: endpoint
    replacement: "9500"
The only difference between the two generated prometheus.yaml files is the following:
diff /tmp/prometheus.yaml /tmp/prometheus_2.yaml
1056c1056
< - __meta_kubernetes_pod_container_port_name
---
> - __meta_kubernetes_pod_container_port_number
My concern is that targetPort was deprecated by https://github.com/prometheus-operator/prometheus-operator/pull/3078. But using port in place of targetPort doesn't work, probably because I'm not using a port name: those OpenEBS pods don't declare one.
@galindro I have the exact same issue. I can't seem to configure a way to reach an unnamed port from a PodMonitor. That seems like a massive oversight. I'm not always in control of the naming of ports on pods. I should be able to configure scraping at an arbitrary numerical port. A workaround would be useful. I'm not sure what that would look like.
Well, your case is even worse... I have no clue how to help you work around it. Sorry.
My workaround in this case has been to use the (deprecated) targetPort field to point to the arbitrary numerical port on the pod.
Note that this works:
podMetricsEndpoints:
- targetPort: 8080
While this doesn't:
podMetricsEndpoints:
- targetPort: "8080"
... presumably because, by encapsulating the port number in quotes, you're instructing Prometheus to look for a container port named "8080".
Also worth noting that the above workaround doesn't work when using port: 8080; that fails validation with:
* spec.podMetricsEndpoints.port: Invalid value: "integer": spec.podMetricsEndpoints.port in body must be of type string: "integer"
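For completeness, the non-deprecated route seems to require a named port on the Pod, referenced by name (a string) in the PodMonitor; a sketch, with illustrative names:

# Pod template: give the container port a name
ports:
- name: metrics
  containerPort: 8080

# PodMonitor: reference the port by its name
podMetricsEndpoints:
- port: metrics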
For the situation where the Pod does not declare any ports (as mentioned by @bricef), a workaround can be implemented as in the following example:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: rook-ceph-operator
spec:
  selector:
    matchLabels:
      app: rook-ceph-operator
  namespaceSelector:
    matchNames:
    - rook-ceph
  podMetricsEndpoints:
  - relabelings:
    - action: replace
      targetLabel: __address__
      sourceLabels:
      - __address__
      replacement: $1:8080
That means, instead of defining the port to scrape with port or targetPort, you manually redefine the __address__ label in the relabelings field to include the desired target port.
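In the generated Prometheus config, that relabeling should come out roughly as follows (a sketch; it relies on __address__ being just the pod IP when the Pod declares no ports, and on the default (.*) regex providing the $1 capture group):

relabel_configs:
- source_labels: [__address__]
  target_label: __address__
  # appends the desired scrape port to the bare pod IP
  replacement: $1:8080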
That's a neat trick!
This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.
Stay away @stale
This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.
Please remove the stale label again...
For the situation where the Pod does not declare any ports (as mentioned by @bricef), a workaround can be implemented as in the following example:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: rook-ceph-operator
spec:
  selector:
    matchLabels:
      app: rook-ceph-operator
  namespaceSelector:
    matchNames:
    - rook-ceph
  podMetricsEndpoints:
  - relabelings:
    - action: replace
      targetLabel: __address__
      sourceLabels:
      - __address__
      replacement: $1:8080

That means, instead of defining the port to scrape with port or targetPort, you manually redefine the __address__ label in the relabelings field to include the desired target port.
If a container port is already declared, use a regex to replace only the port:
- relabelings:
  - action: replace
    targetLabel: __address__
    sourceLabels:
    - __address__
    regex: (.*):.*
    replacement: "${1}:9091"
This should all be validated by the operator. It took me four hours of trying every combination of namespaces, labels, port, and targetPort to arrive at a working ServiceMonitor. If it simply doesn't work after following the official and unofficial tutorials and every question on the internet about the topic, you end up having to try everything.
So apparently one needs to specify targetPort, but that's mentioned nowhere; I've even seen it marked as deprecated in some docs, and according to posts in this issue it is. targetPort needs to be an integer. I also specified port, which is the name of the port, but a number passed as a string is accepted as well, which is a guarantee for trouble, especially when docs like https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md don't mention anything about this very particular combination of port and targetPort. I tried about a hundred combinations in the last hours and I'm pretty sure this is the only way to make it work, but it would be so much easier if it were simply documented by the devs or validated with intuitive error messages. It should not be possible to get this wrong.
It's a neat concept that takes about 20 minutes with intuitive docs, and about 2 minutes if the operator or Prometheus gave any feedback at all.
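For reference, the pairing the field documentation (quoted in the issue description below) describes is a ServiceMonitor port referring to the Service's port name. Roughly, as a sketch with illustrative names:

apiVersion: v1
kind: Service
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  # this name is what the ServiceMonitor's "port" references
  - name: metrics
    port: 8080
    targetPort: 8080
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  # Service port name (a string), not a number
  - port: metrics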
What did you do?
We spent a huge amount of time trying to understand why we could not scrape a newly added service's metrics until we discovered the following:

Exposing a port here gives the system additional information about the network connections a container uses, but is primarily informational

as stated in the Kubernetes API docs. The ServiceMonitor API docs, in turn, describe the endpoint fields as:

port: Name of the service port this endpoint refers to. Mutually exclusive with targetPort.
targetPort: Name or number of the target port of the endpoint. Mutually exclusive with port.

So you set targetPort equal to the number of your Service's port.

What did you expect to see?
Metrics.

What did you see instead? Under which circumstances?
Instead we discovered that ServiceMonitor's targetPort actually transforms into a match on __meta_kubernetes_pod_container_port_number, which means that Prometheus is told to scrape not the Endpoints resources but the Container resources. See https://github.com/coreos/prometheus-operator/blob/v0.28.0/pkg/prometheus/promcfg.go#L378-L399. This creates a lot of confusion...

Environment
Prometheus Operator version: v0.28.0@sha256:62c5fe0246d88746001f071bfc764d3480dc1fc6516af52bde9d2416778dc843
Kubernetes version information:
Kubernetes cluster kind: GKE
Manifests:

Proposal
Documentation should be amended to clearly state how the []Endpoints' port and targetPort work. Since transforming targetPort into __meta_kubernetes_pod_container_port_number / __meta_kubernetes_pod_container_port_name has been there for a long while, a lot of users rely on it, and it should not be removed, for backwards compatibility. Instead, __meta_kubernetes_endpoint_port_number should be taken into consideration as well.

P.S. While I was writing all of the above I hadn't yet dug deeply into Prometheus itself, but at first glance this looks like an upstream issue, since I don't see __meta_kubernetes_endpoint_port_number there.

Two questions here then:
What were the initial considerations for using __meta_kubernetes_endpoint_port* instead of __meta_kubernetes_service_port_*? I assume this was done because Endpoints contain the actual target information, but I still ask because, as we assumed, the ServiceMonitor should have grabbed the port number/name from the Service.