aerogear / keycloak-metrics-spi

Adds a Metrics Endpoint to Keycloak
Apache License 2.0

Exporter metrics do not persist a Keycloak service restart #76

Closed nikosmeds closed 3 years ago

nikosmeds commented 3 years ago

Description

Restarting keycloak.service resets the exporter's Prometheus counters, e.g. the keycloak_logins metrics start over after a service restart.

Expected Behavior

Ideally, the Prometheus counters would persist across service restarts (or a flag would let the operator choose the counter reset behavior).

This functionality would allow us to track things like:

Right now we are limited by Prometheus' data retention settings, and the irregular resets can make graphs a bit awkward.

Is such a thing possible?
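For context on the awkward graphs: a counter that drops is conventionally read as a reset, and Prometheus functions such as increase() and rate() compensate for that inside the query window. The sketch below is a simplified illustration of that idea only. The function name and the sample values are made up for this example, and the real Prometheus functions also extrapolate to the window boundaries, which is left out here.

```python
def increase_with_resets(samples):
    """Total growth of a counter across scraped values.

    Any drop between two consecutive samples is treated as a restart
    from zero, so the new value is counted as fresh growth.
    """
    total = 0.0
    prev = None
    for value in samples:
        if prev is None:
            pass  # first sample: nothing to compare against yet
        elif value < prev:
            total += value  # counter reset: it restarted from zero
        else:
            total += value - prev
        prev = value
    return total


# Hypothetical scrapes of a login counter; the drop from 20 to 2 is a restart.
scraped = [10, 14, 20, 2, 5]
print(increase_with_resets(scraped))
```

This only helps while the samples are still inside Prometheus' retention window, which is the limitation described above.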

nikosmeds commented 3 years ago

Also, this isn't related to my issue - but does anyone know the difference between the following two metrics

or are these monitoring the same events?

pb82 commented 3 years ago

@nikosmeds I don't think that it can be done easily because the SPI just counts certain events without persisting them. As far as I know, Keycloak does (or can) persist events in the database. So in principle it might be possible by exporting the persisted data instead of hooking into the live events.
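A rough, language-agnostic sketch of that difference, written in Python rather than the SPI's Java. Every name here (InMemoryEventCounter, counts_from_store, the store list) is hypothetical and not taken from keycloak-metrics-spi or Keycloak; the list merely stands in for a persistent event table.

```python
from collections import Counter


class InMemoryEventCounter:
    """Counts events in process memory only, so a restart begins at zero."""

    def __init__(self):
        self.counts = Counter()

    def on_event(self, event_type):
        self.counts[event_type] += 1


def counts_from_store(persisted_events):
    """Derive totals from events that survived the restart."""
    return Counter(event["type"] for event in persisted_events)


store = []  # stand-in for a persistent event table (assumption)


def record(counter, event_type):
    """Persist the event and also feed the live in-memory counter."""
    store.append({"type": event_type})
    counter.on_event(event_type)


counter = InMemoryEventCounter()
for _ in range(3):
    record(counter, "LOGIN")

counter = InMemoryEventCounter()  # simulated service restart
record(counter, "LOGIN")

print(counter.counts["LOGIN"])            # live counter after the restart
print(counts_from_store(store)["LOGIN"])  # total derived from persisted events
```

The live counter only sees the one login after the simulated restart, while the total derived from the stored events still includes the three earlier ones.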

or are these monitoring the same events?

In this case they monitor the same events: keycloak_logins is an explicit metric that is recorded by the SPI while keycloak_user_event_CLIENT_LOGIN is one of the generic metrics generated from the list of all user events: https://github.com/aerogear/keycloak-metrics-spi/blob/master/src/main/java/org/jboss/aerogear/keycloak/metrics/PrometheusExporter.java#L93

nikosmeds commented 3 years ago

Thanks for the quick reply @pb82. Okay, I'll look into this further and post an update if I can find a solution.

pb82 commented 3 years ago

@nikosmeds Ok, I'll close this for now. Feel free to update this or submit a PR with instructions if you find a better solution. Thanks!

nikosmeds commented 3 years ago

Well, we've found an alternative solution. It doesn't relate to keycloak-metrics-spi, but I'll share it here anyway for anyone who runs into a similar problem.

We use Filebeat and ship all the Keycloak logs to our ELK stack - these log entries include every login and registration event. Thus we can query Kibana to see those events over a longer period of time.
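As a minimal sketch of the kind of count such a query produces, the snippet below tallies event types from log lines. The log format is an assumption: it supposes each event line contains a `type=<EVENT>` field, and the sample lines are invented for illustration, not copied from a real Keycloak log. The actual setup described here does this with Filebeat and Kibana, not with a script.

```python
import re
from collections import Counter

# Assumed format: an event line carries a "type=<EVENT_NAME>" field.
EVENT_TYPE = re.compile(r"\btype=(\w+)")


def count_events(lines):
    """Count log lines per event type, ignoring lines without a type field."""
    counts = Counter()
    for line in lines:
        match = EVENT_TYPE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines (hypothetical format and values).
sample_lines = [
    "2020-01-01 10:00:00 DEBUG [org.keycloak.events] type=LOGIN, realmId=demo, userId=u1",
    "2020-01-01 10:05:00 DEBUG [org.keycloak.events] type=REGISTER, realmId=demo, userId=u2",
    "2020-01-02 09:30:00 DEBUG [org.keycloak.events] type=LOGIN, realmId=demo, userId=u2",
    "2020-01-02 09:31:00 INFO  [org.keycloak.services] unrelated message",
]

totals = count_events(sample_lines)
print(totals["LOGIN"], totals["REGISTER"])
```

Because the log entries are stored outside Keycloak, the totals are unaffected by service restarts.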

Still making use of this exporter and Grafana dashboards for real-time monitoring and reviewing recent events. Thanks!