oliver006 / redis_exporter

Prometheus Exporter for ValKey & Redis Metrics. Supports ValKey and Redis 2.x, 3.x, 4.x, 5.x, 6.x, and 7.x
https://github.com/oliver006/redis_exporter
MIT License
3.04k stars 860 forks source link

Added metrics to display the last 10 slow log commands name , command duration and the execution time #807

Closed alagusivakumar closed 1 year ago

oliver006 commented 1 year ago

Thanks for the PR, I'll try to get to reviewing it soon!

alagusivakumar commented 1 year ago

Thanks for the PR! This can be a useful feature but should be behind a command-line option that's off by default. Putting the commands in a label can 1) drastically blow up the cardinality of metrics (why the timestamp? I don't think that's a good idea) and 2) can be undesired from a privacy/security perspective.

Also, this needs some tests (see slowlog_test.go for inspiration)

Please find the response to your quries 1) Putting the commands in a label can 1) drastically blow up the cardinality of metrics (why the timestamp? I don't think that's a good idea) Commands in the label will help the user to identify which commands caused the slowlogs. In the existing code itself , commands are in the labels for the metrics like redis_commands_total , redis_commands_duration_seconds_total. It is similar to that

In the current exporter code , you are exposing the metrics of the last slowlog. But we donot know when the last slow log has been recorded. For example , consider a scenario where  we have set the retention policy for metrics to 2 days (we can able to see the metrics only for the past 2days) .In that case , If the last slow log  occurred before 2 days ,   we cannot able to pinpoint the time when the last slow has been recorded . In other words , we cannot the able to find the time of the last slow log execution , if the time period between (lastslowlog time and the current time)   is greater than the metrics retention time.

Also , i am aware that slow log contents itself seems not really suitable for exporting. but some clients are having usecase for that . Example: https://github.com/oliver006/redis_exporter/issues/67 . It will be very helpful in the case where we want to observe the redis cpu spike and the slowlog details in the same place.

2) can be undesired from a privacy/security perspective I have added an argument slowlog-history-enabled . By default , the slowlog history metrics will not be exposed. The user have to use the argument slowlog-history-enabled or set the env variable REDIS_EXPORTER_SLOWLOG_HISTORY_ENABLED to enable this feature

oliver006 commented 1 year ago

Commands in the label will help the user to identify which commands caused the slowlogs. In the existing code itself , commands are in the labels for the metrics like ....

That's different though. Those commands only record the general command like GET but not the parameters that were used so there's a big difference in potential cardinality.

The tests are failing on go formatting - can you run go fmt ./... please?

oliver006 commented 1 year ago

I thought about this a bit more and I came to the conclusion this doesn't belong in the exporter, this belongs into a log collection system (loki or whatever you use for logs). Prometheus is mainly for exporting and collecting metrics so I'd suggest you use something different to collect logs like the slowlog.