Question: Prometheus metrics with nginx-ingress scaled to >=2 pods yield inconsistent output when queried from outside the cluster

b2cc commented 1 year ago

FYI: Since this is only a question, I tried to follow the "Support" link, but I couldn't sign up to Slack, so I'm creating this issue.

We have a basic/default setup of the ingress controller deployed via helm chart on OKD 4.12. Loadbalancing is supplied via metal-lb and everything works - apps are accessible and ingresses work as expected. For redundancy we currently run the deployment with 4 replicas, one of which receives all the traffic, and the three other ones just idling for fail-over purpose.

We're now in the process of setting up monitoring/metering with Prometheus as per the documentation (https://kubernetes.github.io/ingress-nginx/user-guide/monitoring/). Since Prometheus is also used for monitoring of other components we run it outside the cluster on a dedicated server. Therefore the service to expose the Prometheus endpoint is of type NodePort so Prometheus is able to scrape it from outside the cluster.

This works and we can confirm by running curl on the /metrics endpoint the NodePorts that we can get some metrics. However most of the time we seem to hit an idle replica pod and most the metrics are empty due to the way the service loadbalancing works (roundrobin?). Only sometimes we hit the pod that is actually handling all the traffic and get the correct values.

We scoured the documentation, but we couldn't find a way around this issue.

Are we missing a certain configuration?
Must Prometheus run inside the ingress-nginx namespace for this to work?

NGINX Ingress controller version

/nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.5.1
  Build:         d003aae913cc25f375deb74f898c7f3c65c06f05
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

Kubernetes version (use kubectl version):

oc version
Client Version: 4.12.0-0.okd-2023-02-04-212953
Kustomize Version: v4.5.7
Server Version: 4.12.0-0.okd-2023-02-04-212953
Kubernetes Version: v1.25.0-2653+a34b9e9499e6c3-dirty

Environment:

Openshift/OKD 4.12 cluster x86_64

How was the ingress-nginx-controller installed: installed via helm, no user-supplied values (default)

helm ls -A | grep -i ingress
ingress-nginx ingress-nginx  1  2023-01-27 20:09:37.380997344 +0100 CET deployed  ingress-nginx-4.4.2 1.5.1

Current State of the controller:


oc get pod -o wide
NAME                                        READY   STATUS    RESTARTS   AGE    IP             NODE
ingress-nginx-controller-779798ff78-lhdcf   1/1     Running   0          113m   10.132.0.214   compute01
ingress-nginx-controller-779798ff78-nsrzl   1/1     Running   0          113m   10.133.0.147   compute03
ingress-nginx-controller-779798ff78-qgzdb   1/1     Running   0          113m   10.134.0.94    compute02
ingress-nginx-controller-779798ff78-xk587   1/1     Running   0          113m   10.135.0.94    compute04

oc get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE ingress-nginx-controller LoadBalancer 172.31.13.57 10.1.200.199 80:32439/TCP,443:30691/TCP 30d ingress-nginx-controller-admission ClusterIP 172.31.244.148 443/TCP 30d nodeport-ingress-nginx-prometheus NodePort 172.31.29.28 10254:30254/TCP 102m

oc describe ingressclasses.networking.k8s.io Name: nginx Labels: app.kubernetes.io/component=controller app.kubernetes.io/instance=ingress-nginx app.kubernetes.io/managed-by=Helm app.kubernetes.io/name=ingress-nginx app.kubernetes.io/part-of=ingress-nginx app.kubernetes.io/version=1.5.1 helm.sh/chart=ingress-nginx-4.4.2 Annotations: meta.helm.sh/release-name: ingress-nginx meta.helm.sh/release-namespace: ingress-nginx Controller: k8s.io/ingress-nginx Events:

Name: openshift-default Labels: Annotations: Controller: openshift.io/ingress-to-route Parameters: APIGroup: operator.openshift.io Kind: IngressController Name: default

oc describe pod ingress-nginx-controller-779798ff78-lhdcf Name: ingress-nginx-controller-779798ff78-lhdcf Namespace: ingress-nginx Priority: 0 Service Account: ingress-nginx Node: compute01/10.1.200.203 Start Time: Sun, 26 Feb 2023 19:26:09 +0100 Labels: app=ingress-nginx app.kubernetes.io/component=controller app.kubernetes.io/instance=ingress-nginx app.kubernetes.io/name=ingress-nginx pod-template-hash=779798ff78 Annotations: k8s.v1.cni.cncf.io/network-status: [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.132.0.214" ], "default": true, "dns": {} }] k8s.v1.cni.cncf.io/networks-status: [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.132.0.214" ], "default": true, "dns": {} }] kubectl.kubernetes.io/restartedAt: 2023-02-26T19:24:52+01:00 openshift.io/scc: privileged Status: Running IP: 10.132.0.214 IPs: IP: 10.132.0.214 Controlled By: ReplicaSet/ingress-nginx-controller-779798ff78 Containers: controller: Container ID: cri-o://275eb7160f8b1c8c9b9b591be148dabd225dadd94fdbe43db88cd77eb1cf3f1c Image: registry.k8s.io/ingress-nginx/controller:v1.5.1@sha256:4ba73c697770664c1e00e9f968de14e08f606ff961c76e5d7033a4a9c593c629 Image ID: registry.k8s.io/ingress-nginx/controller@sha256:2f7551977e8553a50cd88e8175b1411acbef319f7040357b58be95e9b99c07e5 Ports: 80/TCP, 443/TCP, 8443/TCP, 10254/TCP Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP Args: /nginx-ingress-controller --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller --election-id=ingress-nginx-leader --controller-class=k8s.io/ingress-nginx --ingress-class=nginx --configmap=$(POD_NAMESPACE)/ingress-nginx-controller --validating-webhook=:8443 --validating-webhook-certificate=/usr/local/certificates/cert --validating-webhook-key=/usr/local/certificates/key State: Running Started: Sun, 26 Feb 2023 19:26:12 +0100 Ready: True Restart Count: 0 Requests: cpu: 100m memory: 90Mi Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5 Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3 Environment: POD_NAME: ingress-nginx-controller-779798ff78-lhdcf (v1:metadata.name) POD_NAMESPACE: ingress-nginx (v1:metadata.namespace) LD_PRELOAD: /usr/local/lib/libmimalloc.so TZ: Europe/Vienna Mounts: /usr/local/certificates/ from webhook-cert (ro) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r5djr (ro) Conditions: Type Status Initialized True Ready True ContainersReady True PodScheduled True Volumes: webhook-cert: Type: Secret (a volume populated by a Secret) SecretName: ingress-nginx-admission Optional: false kube-api-access-r5djr: Type: Projected (a volume that contains injected data from multiple sources) TokenExpirationSeconds: 3607 ConfigMapName: kube-root-ca.crt ConfigMapOptional: DownwardAPI: true ConfigMapName: openshift-service-ca.crt ConfigMapOptional: QoS Class: Burstable Node-Selectors: kubernetes.io/os=linux node-role.kubernetes.io/worker= Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists node.kubernetes.io/not-ready:NoExecute op=Exists for 300s node.kubernetes.io/unreachable:NoExecute op=Exists for 300s



/kind documentation
/remove-kind bug

k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

longwuyuan commented 1 year ago

This is not tested in CI
There are very few users with similar environment active on this github project
Use case of prometheus outside Kubernetes cluster is not a scope of this project
Use case of nodePort for exposing prometheus is also not a scope of this project
I think a prometheus expert need to engage here or alternative you could discuss this in the prometheus forums like the prometheus related channels in K8S slack
Personally I think the total-value of fighting the config and then managing the observability related operations is equal to paying for a commercial offering

/help

k8s-ci-robot commented 1 year ago

@longwuyuan: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

Why are we solving this issue?
To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
Does this issue have zero to low barrier of entry?
How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to [this](https://github.com/kubernetes/ingress-nginx/issues/9672): >- This is not tested in CI >- There are very few users with similar environment active on this github project >- Use case of prometheus outside Kubernetes cluster is not a scope of this project >- Use case of nodePort for exposing prometheus is also not a scope of this project >- I think a prometheus expert need to engage here or alternative you could discuss this in the prometheus forums like the prometheus related channels in K8S slack >- Personally I think the total-value of fighting the config and then managing the observability related operations is equal to paying for a commercial offering > >/help Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

b2cc commented 1 year ago

@longwuyuan : thanks for your explanation.

Let's forget the Nodeport and external monitoring for a second, and just focus on the Prometheus service. Even when deployed as per your documentation, the issue persists as soon as the ingress-nginx deployment is scaled up. This seems to be a valid scenario, since even your helm chart implements an autoscaling mechanism.

Surely I'm not the first one to run more than one ingress pod on a kubernetes cluster?

Or the other way around: what is the supported scenario - must the ingress-nginx deployment be run with only one replica? Is this defined somewhere?

longwuyuan commented 1 year ago

@b2cc you are absolutely right. The multi pod scene is a grey area. Hence I tagged this as "help wanted" from a expert on integration.

There are multiple instances where have hinted at having a scaled out multi-pod/multi-replica being used with the Prometheus+Grafana combo. But its just that, a hint at having multiple replicas with no deep dive into the deployment and config of prom-grafana.

Next step is that someone needs to at least reproduce this and post the detailed data of the missing stuff from a multi replica environ. Although I think that most metrics will be available off of the leader.

praveenmak commented 1 year ago

@longwuyuan
I have a follow up question. How do I see the metrics for "tcp" services by port. I can see metrics for "http" traffic not for "tcp" traffic. Any idea how to view that in prometheus?

marinflorin commented 1 year ago

@b2cc, @longwuyuan, why would be an issue having metrics per instance/replica?

Prometheus works fine with multiple instances of ingress-nginx. Each replica will hold its own metrics, and you create a target for each replica. e.g., for 2 replicas, you will have 2 targets...... for n replicas, you will have n Prometheus targets.

Once the above conditions are met, Prometheus will start scrapping metrics per instance, and you will have metrics per instance ( the labels are different so you will have n metrics ).

e.g. for 2 replicas, you will have 2 metrics for nginx_ingress_controller_nginx_process_requests_total.

nginx_ingress_controller_nginx_process_requests_total{container="controller", controller_class="k8s.io/ingress-nginx", controller_namespace="ingress-controller", controller_pod="nginx-external-ingress-nginx-controller-8f8dbf497-47znb", endpoint="http-metrics", instance="10.6.15.250:10254", job="nginx-external-ingress-nginx-controller-metrics", namespace="ingress-controller", pod="nginx-external-ingress-nginx-controller-8f8dbf497-47znb", service="nginx-external-ingress-nginx-controller-metrics"}
nginx_ingress_controller_nginx_process_requests_total{container="controller", controller_class="k8s.io/ingress-nginx", controller_namespace="ingress-controller", controller_pod="nginx-external-ingress-nginx-controller-8f8dbf497-lppc9", endpoint="http-metrics", instance="10.6.33.183:10254", job="nginx-external-ingress-nginx-controller-metrics", namespace="ingress-controller", pod="nginx-external-ingress-nginx-controller-8f8dbf497-lppc9", service="nginx-external-ingress-nginx-controller-metrics"}

Which is correct and allows you to monitor each instance, or you can simply monitor the cluster by doing a sum over the above metrics:

sum(irate(nginx_ingress_controller_requests{ controller_class=~"$controller_class", ingress=~"$ingress", namespace=~"$namespace", controller_pod=~"$pod"}[$__interval])) by (ingress)

This is how Prometheus works for any serviceMonitor, going deeper would be a question better suited for the Prometheus Community since that knowledge is not in the Nginx scope.

Hopefully, this addresses your question.

longwuyuan commented 1 year ago

They grey area in this issue is prometheus running outside the cluster. I have not had the time to run prometheus outside the cluster and have the scraping done on the controller replicas running inside the cluster.

marinflorin commented 1 year ago

@longwuyuan, as long you create the right targets, it doesn't matter where you run Prometheus, although running it outside the cluster seems overkill because you also need to address the following:

make metrics for each instance accessible from outside the cluster
you would need a service discovery or something similar because the containers are ephemeral and usually short-lived processes. e.g., you have two replicas of Nginx, in which you expose the metrics using NodePort and ten nodes, CA can scale down one of the nodes where Nginx is running, and that replica will be re-scheduled on another node. Then you need to update your Prometheus targets to scrape the new target and remove the old one.

Why don't you just run Prometheus inside the cluster with remote-write to an external Prometheus cluster for centralized metrics?

Note: this is as deep as we should go since this is an Nginx repo and not Prometheus talk.

@b2cc, if @longwuyuan and my comments helped, please feel free to close the issue.

b2cc commented 1 year ago

@marinflorin : thanks for adding to this issue!

I understand wo a certain degree what you mean, but my issue is exactly in what you mention when you write:

as long you create the right targets, it doesn't matter where you run Prometheus

Currently I created a service, but this doesn't seem to be the correct way to do it, because it just does round-robin over the ingress pods and I'm missing metrics.

Which kind or type of target are you referring to in this case?

longwuyuan commented 2 months ago

Hi,

Reading all the info here after a year now brings up a need to update here.

There are "agent" like configs from prometheus, demonstrated in the SAAS service offered by the Company Grafana Labs. This is a technique to push metrics to a external prometheus server. I think you should explore that. The reason being this work is out of the scope of the core Ingress-API specs and the project i snot able to support/maintain features and use-cases that are too far from the core Ingress-API specs & functionalities. There is a lack of resources like developer time.

For cross namespace use of prometheus, there is docs for serviceMonitor but for out-of-cluster prometheus, it will be helpful to get docs PR contributions. There is no resources to test & document that use-case and prometheus's native documentation is far far superior to any efforts this project can make.

Since there is no action-item this issue tracks now, I will close this issue because the project is requiring to limit the work on features far away from the Ingress-API, while releasing secure by default controller and implementing the Gateway-API.

/close

k8s-ci-robot commented 2 months ago

@longwuyuan: Closing this issue.

In response to [this](https://github.com/kubernetes/ingress-nginx/issues/9672#issuecomment-2342823937): >Hi, > >Reading all the info here after a year now brings up a need to update here. > >There are "agent" like configs from prometheus, demonstrated in the SAAS service offered by the Company Grafana Labs. This is a technique to push metrics to a external prometheus server. I think you should explore that. The reason being this work is out of the scope of the core Ingress-API specs and the project i snot able to support/maintain features and use-cases that are too far from the core Ingress-API specs & functionalities. There is a lack of resources like developer time. > >For cross namespace use of prometheus, there is docs for serviceMonitor but for out-of-cluster prometheus, it will be helpful to get docs PR contributions. There is no resources to test & document that use-case and prometheus's native documentation is far far superior to any efforts this project can make. > >Since there is no action-item this issue tracks now, I will close this issue because the project is requiring to limit the work on features far away from the Ingress-API, while releasing secure by default controller and implementing the Gateway-API. > >/close Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

kubernetes / ingress-nginx

Question: Prometheus metrics with nginx-ingress scaled to >=2 pods yield inconsistent output when queried from outside the cluster #9672

Guidelines