OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

Grafana Dashboard Metrics (2.x-4.x) Not Populating from Prometheus Source #29732

Open jrbhagwat opened 1 month ago

jrbhagwat commented 1 month ago

Bug Description

We are experiencing an issue with the Open Liberty Grafana dashboard (version 11706) where it does not populate metrics data from the Prometheus source without customisation. This problem is preventing effective monitoring of the Open Liberty application running in our Kubernetes environment.

The affected dashboard is: https://grafana.com/grafana/dashboards/11706-open-liberty/

Steps to Reproduce

The steps below reproduce the bug with the Open Liberty Grafana dashboard: log in to Grafana, import the dashboard, then check the metrics data and filters.

  1. Log in to Grafana:

    • Open your web browser.
    • Navigate to your Grafana instance (e.g., http://<your-grafana-url>:<port>).
    • Enter your Grafana credentials to log in.
  2. Import the Open Liberty Dashboard:

    • In the Grafana sidebar, click on the “+” icon (Create) and select “Import”.
    • In the Import via grafana.com section, enter the dashboard ID 11706 (or the correct ID for the Open Liberty dashboard).
    • Click on the “Load” button.
    • Select your Prometheus data source from the “Prometheus” dropdown.
    • Click “Import” to add the dashboard.
  3. Open the Imported Dashboard:

    • After importing, you will be redirected to the dashboard.
    • Verify that the dashboard is displayed correctly with its panels.
  4. Check Metrics Data:

    • Observe the panels to see if metrics data is being populated.
    • Confirm that data is not displayed as expected.
  5. Inspect Instance Filter Dropdown:

    • Locate the Instance Filter dropdown on the dashboard.
    • Click on the dropdown to view the available options.
    • Note that the listed entries are not plain Open Liberty instances: each one includes additional key-value details along with the instance IP, so they do not match what the Prometheus metrics expect.
    • This discrepancy may explain why the Prometheus metrics data is not populated.
  6. Review Prometheus Data Source Configuration:

    • Go to Configuration (gear icon) in the sidebar.
    • Select Data Sources.
    • Click on your Prometheus data source to view its settings.
    • Ensure that the URL and Access settings are correctly configured for your Prometheus instance.
  7. Revisit the Dashboard:

    • Go back to the dashboard and refresh it to check if metrics data appears after ensuring all configurations are correct.
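
To separate a Prometheus problem from a dashboard problem, the series can also be listed straight from the Prometheus HTTP API (`/api/v1/series` with a `match[]` selector). Below is a minimal Python sketch under stated assumptions: the base URL and the sample response are made up, and only the URL building and response parsing are shown, not the HTTP call itself.

```python
from urllib.parse import urlencode

METRIC = "base_cpu_processCpuLoad_percent"

def series_url(base_url, metric=METRIC):
    """Build a /api/v1/series URL that lists every label set for `metric`."""
    query = urlencode({"match[]": metric})
    return f"{base_url.rstrip('/')}/api/v1/series?{query}"

def label_values(response, label):
    """Collect the distinct values of `label` from a parsed series response."""
    if response.get("status") != "success":
        return []
    return sorted({s[label] for s in response.get("data", []) if label in s})

# Hypothetical in-cluster address; replace with your own Prometheus URL.
url = series_url("http://prometheus-server.prom.svc:80/")

# Hand-written sample in the shape of a series response (values are made up).
sample = {
    "status": "success",
    "data": [
        {"__name__": METRIC, "instance": "10.1.2.3:9080", "pod": "demo-7c9f"},
        {"__name__": METRIC, "instance": "10.1.2.4:9080", "pod": "demo-8d1a"},
    ],
}
pods = label_values(sample, "pod")
```

If the metric shows up here with the expected `instance` and `pod` labels, Prometheus is scraping correctly and the problem is more likely in the dashboard's variable definitions.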

Expected behavior

The dashboard should automatically display the relevant Prometheus metrics data for the Open Liberty instances without manual customization. Metrics populate correctly when the Instance Filter dropdown lists only the instance IPs, without extra key-value pairs appended.

Diagnostic information:

Additional context

A fixed version is available in YML format here: https://github.com/OpenLiberty/open-liberty-operator/blob/main/deploy/dashboards/metrics/RHOCP4.3-GrafanaOperator3.0.2-Grafana5.2/open-liberty-grafana-mpMetrics2.x.yml

However, we need it in .json format so that we can import the dashboard (https://grafana.com/grafana/dashboards/11706-open-liberty/) by ID or as JSON.

Request:

I kindly request that the Open Liberty team investigate this issue and consider publishing a fixed version of the Grafana dashboard https://grafana.com/grafana/dashboards/11706-open-liberty/ for Kubernetes in the Grafana marketplace. This would greatly enhance the usability and functionality of monitoring Open Liberty applications.

donbourne commented 1 month ago

@jrbhagwat , thanks for bringing this to our attention - we'll investigate.

pgunapal commented 1 month ago

Hi @jrbhagwat, we followed the reproduction steps you provided in the issue but could not reproduce the problem. When we imported the Open Liberty Grafana dashboard (version 11706) into Grafana, the graphs in the dashboard were automatically populated with relevant metrics data, and the correct Open Liberty instance was listed in the Instance Filter dropdown menu.

Can you please let us know the versions of OpenLiberty, mpMetrics, Prometheus, and Grafana you were using when you encountered the problem? It would also help to include the Prometheus configuration YAML file and any other configuration you may have had, so that we can diagnose the issue better.

Finally, it would be good if you could try to reproduce the same issue by following the setup instructions from this blog: https://openliberty.io/blog/2020/04/09/microprofile-3-3-open-liberty-20004.html#gra

Thanks!

jrbhagwat commented 1 month ago

Hi @pgunapal

OpenLiberty versions tried: 24.0.0.8 and 24.0.0.9, including the version range mentioned above in the previous message
mpMetrics version: mpMetrics-2.3

Versions of Prometheus and Grafana we are using are as follows:

Screenshot 2024-10-03 at 10 38 49 AM

Sample installation steps:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus -n prom -f prom-values.yaml --create-namespace

Screenshot 2024-10-03 at 10 54 57 AM

helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana -n prom grafana/grafana -f grafana-values.yaml --create-namespace

pgunapal commented 1 month ago

@jrbhagwat You mentioned in the issue description that the YML-format Grafana dashboard works (https://github.com/OpenLiberty/open-liberty-operator/blob/main/deploy/dashboards/metrics/RHOCP4.3-GrafanaOperator3.0.2-Grafana5.2/open-liberty-grafana-mpMetrics2.x.yml).

We have an equivalent JSON version of it here. Can you please give it a try to see if it resolves the issue? https://github.com/OpenLiberty/open-liberty-operator/blob/main/deploy/dashboards/metrics/OKD3.11-Grafana5.2/open-liberty-grafana-mpMetrics2.x.json

jrbhagwat commented 1 month ago

Hi @pgunapal

The YML-format dashboard has a fix for the regex, like this: "regex": "/base_cpu_processCpuLoad_percent{.*pod=\"(.*?)\".*/",

We also tried the JSON equivalent you mentioned above (https://github.com/OpenLiberty/open-liberty-operator/blob/main/deploy/dashboards/metrics/OKD3.11-Grafana5.2/open-liberty-grafana-mpMetrics2.x.json). However, importing that JSON directly into Grafana does not populate data; the regex needs to be customised first.

The regex ("regex": "/base_cpu_processCpuLoad_percent{.*pod=\"(.*)\",service=.*/",) in the JSON file https://github.com/OpenLiberty/open-liberty-operator/blob/main/deploy/dashboards/metrics/OKD3.11-Grafana5.2/open-liberty-grafana-mpMetrics2.x.json is not working.

Only if we customise the above JSON file by changing the regex to "regex": "/base_cpu_processCpuLoad_percent{.*pod=\"([^\"]*)\".*}/" and then import that JSON into Grafana does the dashboard work.

Screenshot 2024-10-14 at 1 41 27 PM
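
The difference between the two patterns can be sketched in Python, assuming they read as `...pod="(.*)",service=.*` (the shipped JSON) and `...pod="([^"]*)".*}` (the customised one). The series string is hypothetical: its label names and values are made up, and its only relevant property is that no `service` label follows `pod`. Braces are escaped here for clarity in Python's `re`; Grafana's own regex handling may differ in details.

```python
import re

# Hypothetical series string; labels and values are illustrative only.
series = (
    'base_cpu_processCpuLoad_percent{app="demo",instance="10.1.2.3:9080",'
    'job="kubernetes-pods",namespace="prod",pod="demo-7c9f"}'
)

# Assumed shape of the shipped pattern: needs `,service=` right after `pod`.
strict = re.compile(r'base_cpu_processCpuLoad_percent\{.*pod="(.*)",service=.*')

# Assumed shape of the customised pattern: capture up to pod's closing quote.
relaxed = re.compile(r'base_cpu_processCpuLoad_percent\{.*pod="([^"]*)".*\}')

def extract_pod(pattern, text):
    """Return the captured pod name, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None

strict_result = extract_pod(strict, series)    # no `service` label, so no match
relaxed_result = extract_pod(relaxed, series)  # captures the pod name only
print(strict_result, relaxed_result)
```

Under these assumptions the first pattern depends on the exact label set and order that Prometheus attaches, which varies between setups, while the second depends only on the `pod` label being present.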

Request:

I kindly request that the Open Liberty team consider publishing a fix for the regex in https://github.com/OpenLiberty/open-liberty-operator/blob/main/deploy/dashboards/metrics/OKD3.11-Grafana5.2/open-liberty-grafana-mpMetrics2.x.json. This would greatly enhance the usability and functionality of monitoring Open Liberty applications without any customisation from our side.

pgunapal commented 2 weeks ago

@jrbhagwat We recently published a new JSON version of the dashboard here. Can you please give it a try?

jrbhagwat commented 3 days ago

Yes, importing this JSON version works. Thanks @pgunapal. However, importing by ID 11706 (https://grafana.com/grafana/dashboards/11706-open-liberty/) still does not work.