michaldo opened 5 days ago
Could you please verify that the two outputs are the same and that you don't have four times more time series in the case of Prometheus 1.x?
Approximately how many time series (or lines) do you have in your Prometheus output?
Also, do the second and subsequent scrapes consume an equal amount of memory, or are they more lightweight?
What tool did you use for the flame graph?
Will you get similar results if you look at a heap dump?
Also, Micrometer's overhead on the flame graphs seems pretty minimal to me. Depending on the answers to the questions above, maybe this should be a Prometheus Client question in the end?
Hi Jonatan, I checked memory usage on a test cloud cluster where only micrometer-registry-prometheus was changed. So the answer to questions 1 and 3 is that the environment is stable and memory consumption is stable.
It may be important that my application heavily uses Kafka and the Prometheus output contains 4703 Kafka rows out of 4850 total. The flame graph was generated with IntelliJ.
After inspecting the profiler output, the suspected fragments of code are io.micrometer.core.instrument.config.NamingConvention#toSnakeCase and io.prometheus.metrics.model.snapshots.PrometheusNaming#replaceIllegalCharsInMetricName. They could be refactored to require less memory.
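To illustrate the kind of refactoring meant here, below is a minimal sketch of a camelCase-to-snake_case conversion that builds its result in a single `StringBuilder`, so each call allocates one builder and one result `String` rather than intermediate strings per character. This is illustrative only, not the actual Micrometer code:

```java
public class SnakeCase {
    // Converts camelCase to snake_case with a single StringBuilder allocation,
    // sized up front to avoid internal array growth in the common case.
    static String toSnakeCase(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 4);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (Character.isUpperCase(c)) {
                if (i > 0) {
                    sb.append('_');
                }
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("someMetricName")); // prints some_metric_name
    }
}
```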
> After inspecting the profiler output, the suspected fragments of code are io.micrometer.core.instrument.config.NamingConvention#toSnakeCase and io.prometheus.metrics.model.snapshots.PrometheusNaming#replaceIllegalCharsInMetricName. They could be refactored to require less memory.
This conflicts with your flame graphs; I don't see any of these classes on them.
NamingConvention has not been changed for about two years. PrometheusNamingConvention did change in 1.13 (Boot 3.3), but now it is doing less work and delegating to io.prometheus.metrics.model.snapshots.PrometheusNaming.
If you really suspect that io.prometheus.metrics.model.snapshots.PrometheusNaming is the root cause, you might want to create an issue for Prometheus Client.
Indeed, the suspected code I found has a small impact on the flame graph.
It is hard to determine the root cause of the memory consumption. I see that the high-level code changed while the memory consumption shows up in the low-level code. Is the root cause the recently changed high-level code, or the low-level code that has not changed for years?
Anyway, I maintain my opinion that the methods I pointed out are inefficient and that micrometer-registry-prometheus 1.13.0 uses more memory than micrometer-registry-prometheus-simpleclient 1.13.0.
> It is hard to determine the root cause of the memory consumption.
Analyzing a heap dump might help.
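A heap dump can be captured programmatically from inside the application right after a scrape, using the HotSpot-specific diagnostic MXBean (this assumes a HotSpot-based JVM); the resulting .hprof file can then be opened in Eclipse MAT, VisualVM, or IntelliJ to see which classes retain the most memory:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.lang.management.ManagementFactory;

public class HeapDump {
    // Dumps the current JVM's heap to an .hprof file.
    static void dump(String path) throws Exception {
        // dumpHeap refuses to overwrite an existing file, so remove it first.
        new File(path).delete();
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, true); // true = only live (reachable) objects
    }

    public static void main(String[] args) throws Exception {
        dump("scrape-heap.hprof");
        System.out.println("heap dump written");
    }
}
```

Alternatively, `jcmd <pid> GC.heap_dump <file>` captures the same dump from outside the process.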
> I see that the high-level code changed while the memory consumption shows up in the low-level code. Is the root cause the recently changed high-level code, or the low-level code that has not changed for years?
I don't understand this question.
> Anyway, I maintain my opinion that the methods I pointed out are inefficient and that micrometer-registry-prometheus 1.13.0 uses more memory than micrometer-registry-prometheus-simpleclient 1.13.0.
I never stated the opposite; I was trying to point out that you might have opened the issue in the wrong repo. Micrometer is not the same as Prometheus Client (which is where io.prometheus.metrics.model.snapshots.PrometheusNaming comes from).
I observed that the Micrometer registry consumes more memory when I switch Spring Boot from 3.2.6 to 3.3.1.
Environment
I attached a profiler report collected by async-profiler.
Green is the memory consumption for micrometer-registry-prometheus: scrape() takes 446 MB. Red is the memory consumption for micrometer-registry-prometheus-simpleclient (I keep Spring Boot 3.3.1 but switch to the simple client): 121 MB.
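To compare numbers like these without a full profiler run, per-call allocation can also be measured directly with the HotSpot-specific ThreadMXBean (assumes a HotSpot-based JVM; wrap your registry's scrape() call in the task instead of the placeholder work shown here):

```java
import java.lang.management.ManagementFactory;

public class AllocMeter {
    // Returns the number of bytes allocated by the current thread while
    // running the given task, e.g. a single scrape() of a meter registry.
    static long allocatedBy(Runnable task) {
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long tid = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(tid);
        task.run();
        return bean.getThreadAllocatedBytes(tid) - before;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a scrape() call.
        long bytes = allocatedBy(() -> new StringBuilder("x".repeat(1_000_000)).toString());
        System.out.println(bytes + " bytes allocated");
    }
}
```

Running this against both registry versions on the same workload would give a per-scrape allocation figure to compare against the 446 MB vs. 121 MB flame-graph totals.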