knyar / nginx-lua-prometheus

Prometheus metric library for Nginx written in Lua
MIT License
1.46k stars 231 forks source link

replace safe_set with set #147

Closed kingluo closed 2 years ago

kingluo commented 2 years ago

When the number of metrics increase, especially due to label value changes or add/remove labels on reload, it's likely to use up the shdict, when mostly we want to remove the old items in shdict making use of the lru cache mechanism of shdict. The old items are supposed to be ignored.

In fact, even we insist safe_set, the inc we used in other places of codes may also purge old items. So why not make them consistent?

knyar commented 2 years ago

Thank you for proposing this.

In general, I believe that the library should maintain data integrity and never lose metric values. If shdict gets full and lru eviction kicks in, it should be treated as an error: nginx_metric_errors_total should be incremented, and error log messages should say that reported metric values might be incomplete.

safe_set is used as an attempt to maintain data integrity, but I think you are absolutely right that eviction can also happen when a previously nonexistent dictionary key is incremented using :incr. The fact that we do not detect evictions in such cases is a bug.

What I think would be a good way forward is to proceed with replacing safe_set with set, but also catch the forcible return value every time we write to a shdict (using set or incr) and report errors when it becomes true (which means that the dict is full).

kingluo commented 2 years ago

I agree. Let me have a try.

kingluo commented 2 years ago

@knyar Please review my commits and approve the github workflow, thanks.

knyar commented 2 years ago

Thank you! I am going to merge this into a separate branch to make a few follow-up adjustments before merging this to master.