Open yzengin opened 3 months ago
Hello @yzengin,
Could you share a minimal Git repository where the issue can be reproduced? Also please provide any error messages or logs you're seeing.
Thank you
@yzengin a video would help us understand your experience. Is this something you can provide?
This is happening on our instance as well. We're running 11.3, and ever since the last update it struggles to load, with complaints from end users.
@robhamnett if you're able to provide us with information about your current setup (see the Environment section above) and previous setup (before you experienced perf issues), that would be very helpful.
Any further details you can provide will help us, too. For example, does the dashboard in question only rely on the Prometheus data source?
We were running 11.2.2 and updated to 11.3.
OS: Container-Optimized OS v113
Agent: VictoriaMetrics v1.106.0
@robhamnett thank you for the quick reply. One significant change in Grafana 11.3 is that Scenes-powered Dashboards are generally available. As a troubleshooting step, could you try adding &scenes=false to the dashboard's URL? If that results in a performance change, that would be helpful info for us :)
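For anyone who wants to script this check across several dashboards, here is a minimal sketch of appending the parameter to an existing dashboard URL. The function name and the example hostname are hypothetical; only the scenes=false parameter comes from the comment above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_scenes_disabled(dashboard_url: str) -> str:
    """Return dashboard_url with scenes=false set in its query string."""
    parts = urlsplit(dashboard_url)
    # Keep the existing parameters, dropping any earlier 'scenes' value
    # so the parameter is not duplicated.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "scenes"
    ]
    params.append(("scenes", "false"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical URL, for illustration only:
# with_scenes_disabled("https://grafana.example.com/d/abc123/my-dashboard?orgId=1")
```

Loading the same dashboard with and without the parameter, and comparing load times, should show whether Scenes is involved.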
That indeed seemed to have helped.
Great to know, thanks for confirming @robhamnett. I think we should create a new issue to capture your experience, as it's a different situation than the one described by the author of this issue.
@yzengin I wonder if this could be related to a change in Grafana 11 (https://github.com/grafana/grafana/pull/84778), where we default to using the Label Values endpoint over the Series endpoint. The Label Values endpoint can be unacceptably slow sometimes, for reasons that aren't clear yet (see https://github.com/prometheus/prometheus/issues/14551).
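To check whether the Label Values endpoint is the slow one in a given setup, the two Prometheus HTTP API endpoints can be timed directly, outside Grafana. The sketch below is a rough aid, not what Grafana does internally: the helper names, the base URL, and the selector are assumptions, and Grafana also sends a time range with these requests, which this sketch omits.

```python
import time
import urllib.request
from urllib.parse import quote, urlencode


def label_values_url(base_url: str, label: str) -> str:
    """URL of the Prometheus Label Values endpoint for one label."""
    return f"{base_url.rstrip('/')}/api/v1/label/{quote(label, safe='')}/values"


def series_url(base_url: str, selector: str) -> str:
    """URL of the Prometheus Series endpoint for one match[] selector."""
    return f"{base_url.rstrip('/')}/api/v1/series?{urlencode({'match[]': selector})}"


def time_get(url: str, timeout: float = 30.0) -> float:
    """Fetch url and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # include body transfer in the measurement
    return time.perf_counter() - start


# Example (assumes a Prometheus server at this hypothetical address):
# base = "http://localhost:9090"
# print("label values:", time_get(label_values_url(base, "job")))
# print("series:      ", time_get(series_url(base, 'up{job="node"}')))
```

A large gap between the two timings would support the theory that the endpoint change is behind the slowdown.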
If that's indeed the case, I believe we can work around your performance issue. But first, we'd really appreciate it if you could capture the performance issue in a HAR file, so we can examine the request timings. If privacy is an issue, please see https://github.com/grafana/grafana/issues/95370#issuecomment-2459800486 to learn how to sanitize the HAR file.
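Once a HAR file has been captured, the request timings can be skimmed without opening browser dev tools. This is a minimal sketch that relies only on the standard HAR layout (log.entries, each with a total time in milliseconds and a request object); the function names and the file name are hypothetical.

```python
import json


def load_har(path: str) -> dict:
    """Read a HAR file (JSON) from disk, tolerating a UTF-8 byte order mark."""
    with open(path, encoding="utf-8-sig") as handle:
        return json.load(handle)


def slowest_requests(har: dict, top: int = 10) -> list[tuple[float, str, str]]:
    """Return the slowest requests as (elapsed_ms, method, url), slowest first."""
    entries = har.get("log", {}).get("entries", [])
    rows = [
        (float(entry.get("time", 0)), entry["request"]["method"], entry["request"]["url"])
        for entry in entries
    ]
    return sorted(rows, key=lambda row: row[0], reverse=True)[:top]


# Example (hypothetical file name):
# for elapsed_ms, method, url in slowest_requests(load_har("dashboard.har")):
#     print(f"{elapsed_ms:10.0f} ms  {method}  {url}")
```

If the slowest entries point at label values requests, that lines up with the endpoint change described above.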
Next, let's see if using the Series endpoint improves the performance of your dashboard. In your Prometheus data source configuration (specifically, the Performance section), be sure that the Prometheus type & version are set. Because your Prometheus version is 2.40.0, you have two options:
1. Enable the Use series endpoint toggle.
2. Set the version to anything below 2.24.x. As described in #84778, this will enforce that the Series endpoint is used instead of the Label Values endpoint.

I hope this helps! We look forward to hearing your results :)
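To make the version rule concrete, here is a small sketch that models it exactly as stated in this comment: a configured version below 2.24 means the Series endpoint is used. It is an illustration of the rule only, not Grafana's actual implementation, and the function name is hypothetical.

```python
def prefers_series_endpoint(configured_version: str) -> bool:
    """Model of the rule above: versions below 2.24 imply the Series endpoint.

    Only the major and minor parts are compared, so "2.23.0" and "2.23.x"
    are treated the same. A non-numeric major or minor part raises ValueError.
    """
    major, minor = (int(part) for part in configured_version.split(".")[:2])
    return (major, minor) < (2, 24)


# With the real server version configured, the rule does not apply:
# prefers_series_endpoint("2.40.0")  -> False
# With the configured version lowered as a workaround, it does:
# prefers_series_endpoint("2.23.0")  -> True
```

Note that this changes only the version configured in the data source settings; the Prometheus server itself stays on 2.40.0.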
Grafana graphs and InfluxDB queries have also been causing huge issues since Grafana v11.3 (I'm now running v11.3.1). I notice that the data loads very slowly, and after several graphs have loaded, Grafana fails to load more queries and slowly grinds to a halt. The network tab also showed NS_BINDING_ABORTED on the api/ds/query (in my case InfluxDB) API endpoint.
Downgrading back to Grafana v11.2 solved all the performance issues, and the queries are very fast again. All the graphs load their data again without any issues after the downgrade.
So it's definitely not just Prometheus; it also affects InfluxDB v1 queries.
@NWRichmond
Thank you for the clear explanation and the suggested steps. I’ve implemented the Use series endpoint toggle and adjusted the Prometheus data source configuration as you described. The performance issue has been resolved, and everything is functioning as expected now.
Your guidance was precise and effective in addressing the problem. I'll ensure these adjustments are considered for similar scenarios in the future.
Best regards, @yzengin
What happened?
When using Grafana with large data sets, I've noticed that dashboards become slow, and graphs take a long time to load. This is especially noticeable when querying Prometheus as a data source. The response times for queries significantly increase, negatively impacting the user experience. Additionally, memory usage spikes when handling large volumes of data. Could there be optimizations to improve performance in such scenarios?
What did you expect to happen?
I expected the dashboards to load quickly and the graphs to be responsive even with large data sets. The performance should be consistent, regardless of the data size.
Did this work before?
Yes, previous versions of Grafana worked more smoothly with similar data sets, but I’ve noticed the slowdown with the latest version.
How do we reproduce it?
Steps to Reproduce:
Is the bug inside a dashboard panel?
Yes, the issue occurs within the dashboard panels where large data sets are visualized.
Environment (with versions)?
Grafana: 11.1.4
OS: Ubuntu 20.04
Browser: Google Chrome 127.0.6533.120
Grafana platform?
A package manager (APT, YUM, BREW, etc.)
Datasource(s)?
Prometheus 2.40.0