spring-cloud / spring-cloud-dataflow

A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes
Apache License 2.0

Grafana Prometheus streams dashboard shows incorrect values when using multiple server instances #5357

Open klopfdreh opened 1 year ago

klopfdreh commented 1 year ago

Description: The Prometheus metrics of the SCDF server are reported incorrectly. For example, http_server_requests_seconds_max shows the value 0.0 for every path, even while I navigate through the UI.

Release versions: 2.10.3

Custom apps:

Steps to reproduce: Set up spring-cloud-dataflow-server with prometheus-rsocket-proxy and check the /metrics/connected endpoint.

Screenshots: image

Note: We created our own artifact based on https://github.com/spring-cloud/spring-cloud-dataflow/tree/v2.10.3/spring-cloud-dataflow-server. That is why version 1.0.63 is mentioned.

Additional context: The metrics are provided, but the count somehow is not working.

This is a Spring Boot standard metric, so I guess there is something broken in 2.7.x

klopfdreh commented 1 year ago

I found the issue: it occurs when you scale the server up to 2 instances in Kubernetes and both servers export their metrics under the same application name:

      application: myservername
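
One way to picture the effect, as a simplified model rather than the actual proxy or Prometheus code: a time series is identified by its metric name plus its label set, so two replicas that export with identical labels cannot be told apart. The second pod name below is hypothetical.

```python
from typing import Dict, List, Set, FrozenSet, Tuple

LabelSet = FrozenSet[Tuple[str, str]]


def distinct_series(samples: List[Dict[str, str]]) -> Set[LabelSet]:
    """Return the distinct label sets; each one identifies exactly one series."""
    return {frozenset(sample.items()) for sample in samples}


# Both replicas use the same application tag, as in the report above.
same_tag = [
    {"__name__": "process_uptime_seconds", "application": "myservername"},
    {"__name__": "process_uptime_seconds", "application": "myservername"},
]

# Each replica uses its pod name as the tag (second pod name is hypothetical).
per_pod = [
    {"__name__": "process_uptime_seconds", "application": "myservername-3h35f2t3d-rcg8d"},
    {"__name__": "process_uptime_seconds", "application": "myservername-3h35f2t3d-x7k2p"},
]

print(len(distinct_series(same_tag)))  # 1: the replicas are indistinguishable
print(len(distinct_series(per_pod)))   # 2: one series per replica
```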

klopfdreh commented 1 year ago

I got the dashboard from here: https://grafana.com/grafana/dashboards/9933-streams/. It could be changed so that the application label is checked against a prefix pattern instead of an exact name. You could then name the applications myservername-1 and myservername-2, or myservername-randomidentifier.

klopfdreh commented 1 year ago

The dashboard should be adjusted so that it uses the regex matcher =~ in its metric queries.

Variable Value: SERVER_APPLICATION_NAME=myservername.* (the .* is important to match all pods)

Env-Variable: MY_POD_NAME = myservername-3h35f2t3d-rcg8d


"expr": "process_uptime_seconds{application=~\"${SERVER_APPLICATION_NAME}\"}",
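
To illustrate why the trailing `.*` matters: Prometheus regex matchers are fully anchored, so `application=~"myservername"` only matches that exact value. The sketch below emulates this with Python's `re.fullmatch`, assuming Python's regex syntax behaves like Prometheus's RE2 for these simple patterns. The helper `matches_label` and the second pod name are hypothetical.

```python
import re
from typing import List


def matches_label(pattern: str, value: str) -> bool:
    """Emulate a Prometheus `=~` matcher, which must match the whole label value."""
    return re.fullmatch(pattern, value) is not None


pods: List[str] = [
    "myservername-3h35f2t3d-rcg8d",
    "myservername-3h35f2t3d-x7k2p",  # hypothetical second replica
]

# Without the wildcard, no pod-specific application name matches.
exact_only = [p for p in pods if matches_label("myservername", p)]

# With the wildcard, every pod whose name starts with the prefix matches.
with_wildcard = [p for p in pods if matches_label("myservername.*", p)]

print(exact_only)     # []
print(with_wildcard)  # both pod names
```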


      application: ${MY_POD_NAME}

SCDF deployment env-variables:

            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name

klopfdreh commented 1 year ago

Hope this helps for a Kubernetes setup with more than 1 replica. 👍

klopfdreh commented 1 year ago

Alternatively, you could add a selection to the dashboard that lets you choose between the servers.

onobc commented 1 year ago

We could implement @klopfdreh's suggested fix (or something similar) in:

  1. Dashboard(s) we provide in SCDF repo
  2. Dashboard(s) in Grafana labs (https://grafana.com/grafana/dashboards/9933-streams/)

I am not sure what is involved in the second item.