Open LaurentDumont opened 3 years ago
It's hard to tell exactly what the second graph is showing because the time axis is not shown. I need to take a look at the docs_total metric and how Elasticsearch handles it internally. It could be related to a refresh (in the Elasticsearch sense of the term), but I'm not positive.
The `elasticsearch_clusterinfo_last_retrieval_success_ts` metric records the timestamp of the most recent success, so it will go up forever (as time always moves forward). That metric looks correct to me. The cluster info is not updated on every scrape. See the `--es.clusterinfo.interval` flag on the executable.
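Since the metric is a timestamp rather than a duration, the useful number is how far it lags behind "now". Here is a minimal sketch of that calculation against the text the exporter serves on /metrics. It assumes the value is a Unix timestamp in seconds (I have not confirmed the unit), and the helper names `parse_metric_value` and `retrieval_age_seconds` are made up for illustration.

```python
import time


def parse_metric_value(exposition_text, metric_name):
    """Return the first sample value for metric_name, or None if absent.

    Handles plain Prometheus text-format lines such as
    'name 1.0' and 'name{label="x"} 1.0'.
    """
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        if "{" in line:
            name = line.split("{", 1)[0]
            rest = line.rpartition("}")[2]  # text after the label set
        else:
            name, _, rest = line.partition(" ")
        if name != metric_name:
            continue
        fields = rest.split()
        if fields:
            return float(fields[0])
    return None


def retrieval_age_seconds(exposition_text, now=None):
    """Seconds since the last successful cluster info retrieval.

    Assumption: the metric value is a Unix timestamp in seconds.
    """
    ts = parse_metric_value(
        exposition_text, "elasticsearch_clusterinfo_last_retrieval_success_ts"
    )
    if ts is None:
        return None
    if now is None:
        now = time.time()
    return now - ts
```

If the unit assumption holds, the same check in PromQL would be `time() - elasticsearch_clusterinfo_last_retrieval_success_ts`, which should stay below the configured cluster info interval plus one scrape interval.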
Ah, got it for `clusterinfo`. I do see the updated log from the exporter itself.
That said, looking at other metrics, there always seems to be an interval of increase that I cannot match to the scrape interval. You can ignore the small bump at 15:20; that was us restarting the exporter.
Assuming the flow is `HttpRequest to /metrics for exporter --> Exporter --> ElasticSearch --> Fetch current counters from /_nodes/stats and /_all/_stats`, I can't explain the slow updates of the metrics in Prometheus.
I guess it's possible that there is some hidden aggregation done inside Elasticsearch. But if the output is parsed from `/_nodes/stats` and `/_all/_stats`, I can see it clearly change in the ELK Tools tab.
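To rule the exporter in or out, one option is to read the counter straight from Elasticsearch on the same cadence as the scrape and compare it with what Prometheus stored. This is a rough sketch only: the URL in the usage note is an assumption about the setup, `fetch_json` and `index_docs_count` are hypothetical helper names, and reading the `primaries` section (rather than `total`) is my guess at the closest match, not something I have checked against the exporter's code.

```python
import json
import urllib.request


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def index_docs_count(stats, index_name):
    """Return the primary-shard document count for one index.

    `stats` is the decoded body of the index stats API (/_all/_stats).
    Returns None when the index or the docs section is missing.
    """
    try:
        return stats["indices"][index_name]["primaries"]["docs"]["count"]
    except KeyError:
        return None
```

Usage would look something like `index_docs_count(fetch_json("http://localhost:9200/_all/_stats"), "my-index")`, run every 10 seconds. If this number moves between runs while the exporter's value stays flat, the delay is on the exporter side rather than inside Elasticsearch.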
So there could also be something else at play. Looking at another metric and switching to a 1-minute interval, I can clearly see that it updates every 60 seconds.
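The "updates every 60 seconds" observation can be measured rather than read off a graph. A small sketch, assuming the samples have been exported from Prometheus as `(unix_time, value)` pairs sorted by time; `change_intervals` is an illustrative name, not part of any tool:

```python
def change_intervals(samples):
    """Return the seconds elapsed between consecutive value changes.

    `samples` is a list of (unix_time, value) pairs sorted by time,
    e.g. one pair per scrape.
    """
    change_times = []
    previous = None
    for timestamp, value in samples:
        # Record the time of every sample whose value differs from the last one.
        if previous is not None and value != previous:
            change_times.append(timestamp)
        previous = value
    return [later - earlier for earlier, later in zip(change_times, change_times[1:])]
```

With a 10-second scrape interval, a result made up entirely of 60s would suggest the value is refreshed or cached on a one-minute cycle somewhere upstream of Prometheus. The resolution is limited by the scrape interval, so each entry is only accurate to within about one scrape.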
Hey everyone,
I'm trying to leverage the exporter to get some "real-time" statistics regarding our ELK cluster.
It's a single node and the usage is pretty low, so I know that it's quick to answer (the request times in Dev Tools for `/_nodes/stats` and `/_all/_stats` show that each takes about 100 ms to answer).
I've set up Prometheus to scrape the exporter every 10 seconds, thinking that it would let the metric refresh in between and that I could get enough granularity.
But I can see that the retrieval interval is still around 2 minutes, going by `elasticsearch_clusterinfo_last_retrieval_success_ts` (if it means what I think it does).
Prometheus config
I can see that Prometheus does scrape the exporter every 10 seconds in the Targets page of the server.
But if I look at the stats actually collected, it can take up to 20 minutes for the data to update (for metrics that I know should move faster).
Using `elasticsearch_indices_docs_total` for an index that I know is actively receiving docs.
Any ideas?