Open wangxf1987 opened 2 weeks ago
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
To complete the pull request process, please assign poor12 after the PR has been reviewed.
You can assign the PR to them by writing /assign @poor12
in a comment when ready.
The full list of commands accepted by this bot can be found here.
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 53.07%. Comparing base (3232c52) to head (a4d9be4). Report is 13 commits behind head on master.
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
:umbrella: View full report in Codecov by Sentry.
/assign @chaosi-zju
Hi! @wangxf1987 Glad to see your contribution!
The CI of this PR failed because it wasn't signed off. Usually, please use git commit -s -m 'your message'
or git commit -m 'Signed-off-by: AuthorName <authoremail@example.com> \n <other message>'
to pass DCO.
A detailed guideline is available at: https://probot.github.io/apps/dco/
done
/lgtm
Can you write the release-notes? @wangxf1987
Can you write the release-notes? @wangxf1987
@chaunceyjiang OK, I should modify the release notes for that version. Is it https://github.com/karmada-io/karmada/blob/master/docs/CHANGELOG/CHANGELOG-0.10.md ?
@wangxf1987 Here. In your PR description
Expose the default port for the karmada-controller-manager, scheduler and agent when creating a PodMonitor.
done
Is this for ServiceMonitor or PodMonitor?
By the way, this PR focuses on exposing ports in the helm chart template, what about operators and cli? Should we keep them consistent? cc @chaosi-zju
Let me talk about my understanding of this PR:
First, our karmada-controller-manager does listen on port 8080 as the metrics querying API; we defined it in https://github.com/karmada-io/karmada/blob/e6c92d5cba0edfa81e920bfff98db1b4be257d82/cmd/controller-manager/app/controllermanager.go#L161
We can try it as follows:
$ kubectl --context karmada-host exec -it karmada-controller-manager-cdbbf75dd-69fxb -n karmada-system -- sh
/ # curl http://127.0.0.1:8080/metrics
.....
work_sync_workload_duration_seconds_bucket{result="success",le="0.512"} 4
work_sync_workload_duration_seconds_bucket{result="success",le="1.024"} 4
work_sync_workload_duration_seconds_bucket{result="success",le="2.048"} 4
work_sync_workload_duration_seconds_bucket{result="success",le="+Inf"} 4
work_sync_workload_duration_seconds_sum{result="success"} 0.075089723
work_sync_workload_duration_seconds_count{result="success"} 4
Even without this PR, we can still access the /metrics API of karmada-controller-manager from another pod, like:
$ kubectl --context karmada-host exec -it karmada-descheduler-7d654cbff6-5jmrt -n karmada-system -- sh
# here, `10.244.0.12` is the pod IP of `karmada-controller-manager`.
/ # curl http://10.244.0.12:8080/metrics
.....
work_sync_workload_duration_seconds_bucket{result="success",le="0.512"} 4
work_sync_workload_duration_seconds_bucket{result="success",le="1.024"} 4
work_sync_workload_duration_seconds_bucket{result="success",le="2.048"} 4
work_sync_workload_duration_seconds_bucket{result="success",le="+Inf"} 4
work_sync_workload_duration_seconds_sum{result="success"} 0.075089723
work_sync_workload_duration_seconds_count{result="success"} 4
So, the normal operation of the /metrics API does not originally depend on the changes proposed by this PR.
However, @wangxf1987 needs to use Prometheus to collect metrics, as described in this tutorial:
https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md#using-podmonitors
Take PodMonitor as an example:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  podMetricsEndpoints:
  - port: web
In practice, the spec.selector labels tell Prometheus which Pods should be scraped, while spec.podMetricsEndpoints defines an endpoint serving Prometheus metrics to be scraped.
Pay attention to spec.podMetricsEndpoints (see https://github.com/prometheus-operator/prometheus-operator/blob/5b880a05ab72112ff26ac3c3eb5bffddbafbc468/pkg/apis/monitoring/v1/podmonitor_types.go#L169-L184):
the official API recommends using the port field, which represents the "Name of the Pod port which this endpoint refers to", instead of the deprecated field targetPort, which represents the "Name or number of the target port of the Pod object behind the Service".
So here comes the problem: we didn't give the port a name in the container spec of karmada-controller-manager, so @wangxf1987 has nothing to reference in spec.podMetricsEndpoints of the PodMonitor.
This is the reason why this PR is proposed.
Hi @wangxf1987, am I right?
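To make this concrete, here is a minimal sketch of what naming the port looks like and how the PodMonitor would reference it. The port name `metrics` and the `app` label are illustrative assumptions, not the chart's actual values:

```yaml
# In the karmada-controller-manager Deployment template (sketch):
containers:
- name: karmada-controller-manager
  ports:
  - name: metrics        # naming the port is the point of this PR
    containerPort: 8080
    protocol: TCP
---
# The PodMonitor can then reference the container port by name:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: karmada-controller-manager   # hypothetical name
spec:
  selector:
    matchLabels:
      app: karmada-controller-manager   # assumed label
  podMetricsEndpoints:
  - port: metrics    # matches the container port name above
```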
By the way, this PR focuses on exposing ports in the helm chart template, what about operators and cli? Should we keep them consistent?
If we all agree on the necessity of this PR, then we should open an issue to track the corresponding modifications for the other installation methods.
Is this for ServiceMonitor or PodMonitor?
By the way, this PR focuses on exposing ports in the helm chart template, what about operators and cli? Should we keep them consistent? cc @chaosi-zju
Expose this port for PodMonitor only
Yes, the port name needs to be specified in the PodMonitor, so the metrics port name needs to be defined in the container.
I think we also need a new PR describing how to monitor the search component. Similarly, you need to modify the port name in the search service.
---
apiVersion: v1
kind: Service
metadata:
  name: {{ $name }}-search
  namespace: {{ include "karmada.namespace" . }}
  labels:
    {{- include "karmada.search.labels" . | nindent 4 }}
spec:
  ports:
  - name: https # add this
    port: 443
    protocol: TCP
    targetPort: 443
  selector:
    {{- include "karmada.search.labels" . | nindent 4 }}
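Once the Service port has a name, a ServiceMonitor could reference it by name. A sketch only; the resource name, namespace, selector label, and TLS settings here are assumptions to be adapted:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: karmada-search        # hypothetical name
  namespace: karmada-system   # assumed namespace
spec:
  selector:
    matchLabels:
      app: karmada-search     # assumed label; must match the Service's labels
  endpoints:
  - port: https               # references the named Service port added above
    scheme: https
    tlsConfig:
      insecureSkipVerify: true   # for illustration only; configure CA in production
```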
When the port is defined by default, it is used by Prometheus, like this.
It looks really fantastic! 👍
Did you set up this monitoring by following an existing Karmada document, or in your own way?
By the way, I very much hope you can share the complete steps of building this monitoring as a document ٩(๑❛ᴗ❛๑)۶. If you are willing to contribute a document to karmada, you can refer to this doc.
@chaosi-zju Would you like to run a test as per this document, and share it with us at a community meeting?
Would you like to run a test as per this document, and share it with us at a community meeting?
OK
…erviceMonitor
What type of PR is this? This is an improvement for installation: the port of karmada-apiserver is exposed, but the ports of the other components are not.
What this PR does / why we need it:
Which issue(s) this PR fixes: Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: