inahga closed this 8 months ago
Hi, wow!! Thank you for this. Sorry for the lateness; I was out this weekend with the family. I'll look at this soon!
How have your instances been this last week, @inahga?
I'll sleep better once this is merged 🙂
> How have your instances been this last week, @inahga?

I'll let you know by Monday; I was slow to deploy it this week.
@DzigaV We've run this in production for a week now, with no signs of leakage. Feeling pretty confident this fixed it.
That's great news!! Thank you for the contribution and the work required to get it figured out. Glad it's working :+1:
This closes https://github.com/mholt/caddy-ratelimit/issues/36.
Fixes a race condition between `ringBufferRateLimiter` creation and its insertion into a map. This is done by locking the entire map whenever we get or insert a `ringBufferRateLimiter`.
I have replaced the use of `sync.Map` with a normal `map[string]*ringBufferRateLimiter` and a `sync.Mutex`, passed around together in a `rateLimitersMap` struct. I've factored the logic out into methods of `rateLimitersMap`, which enables some careful use of `defer rlm.mu.Unlock()` to avoid leaving a lock held open on `panic()`.

We didn't see a need for `sync.Map`. Its docs recommend a plain map with separate locking for better type safety, and none of its suggested use cases apply here: https://pkg.go.dev/sync#Map. Let me know if I'm misunderstanding the use case (very possible!).
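For reviewers, here's a minimal sketch of the shape of this synchronization, in the spirit of the changeset rather than a copy of it (the field names, `getOrInsert`, and the stubbed `ringBufferRateLimiter` are illustrative assumptions):

```go
package ratelimit

import (
	"sync"
	"time"
)

// ringBufferRateLimiter stands in for the existing limiter type in
// caddy-ratelimit; only what this sketch needs is shown.
type ringBufferRateLimiter struct {
	window    time.Duration
	maxEvents int
	ring      []time.Time
}

func (r *ringBufferRateLimiter) initialize(maxEvents int, window time.Duration) {
	r.maxEvents = maxEvents
	r.window = window
	r.ring = make([]time.Time, maxEvents)
}

// rateLimitersMap guards a plain map with a single mutex so that the
// existence check and the insertion of a ringBufferRateLimiter happen
// in one critical section.
type rateLimitersMap struct {
	mu       sync.Mutex
	limiters map[string]*ringBufferRateLimiter
}

func newRateLimitersMap() *rateLimitersMap {
	return &rateLimitersMap{limiters: make(map[string]*ringBufferRateLimiter)}
}

// getOrInsert returns the limiter for key, creating and storing it under
// the map lock if it does not exist yet. The deferred Unlock keeps the
// lock from being left held if anything below panics.
func (rlm *rateLimitersMap) getOrInsert(key string, maxEvents int, window time.Duration) *ringBufferRateLimiter {
	rlm.mu.Lock()
	defer rlm.mu.Unlock()

	if rl, ok := rlm.limiters[key]; ok {
		return rl
	}
	rl := new(ringBufferRateLimiter)
	rl.initialize(maxEvents, window)
	rlm.limiters[key] = rl
	return rl
}
```

The point is that two goroutines racing on the same key can no longer each create their own limiter, because the lookup and the insert are never separated.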
I've removed the `sync.Pool` for now. Since `ringBufferRateLimiter` creation and insertion are fully synchronized, I didn't see a need for it.
Note that some of the defensive refactoring is not strictly required; I have a change that preserves the existing data structures, but I think the suggested changeset is an overall improvement in maintainability: https://github.com/divviup/caddy-ratelimit/pull/1/commits/65ad951ea012a5410dff297efa9da6f769e20dc0.
Some discussion of the performance impact and profiles is here: https://github.com/divviup/caddy-ratelimit/pull/1. TL;DR: no meaningful impact on CPU, memory, or latency. This implementation could be optimized by replacing the plain mutex with an `RWMutex`, but that would be a marginal improvement (if any) in exchange for much more complicated locking semantics.
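To illustrate that trade-off, here's a rough sketch of what the `RWMutex` variant could look like, reusing the stub types from the sketch above (again an assumption, not something in the PR):

```go
// rateLimitersRWMap is a hypothetical variant of rateLimitersMap that
// uses a sync.RWMutex instead of a sync.Mutex.
type rateLimitersRWMap struct {
	rw       sync.RWMutex
	limiters map[string]*ringBufferRateLimiter
}

func (rlm *rateLimitersRWMap) getOrInsert(key string, maxEvents int, window time.Duration) *ringBufferRateLimiter {
	// Fast path: most requests hit an existing limiter, so a shared
	// read lock is enough for the lookup.
	rlm.rw.RLock()
	rl, ok := rlm.limiters[key]
	rlm.rw.RUnlock()
	if ok {
		return rl
	}

	// Slow path: take the exclusive lock and re-check, because another
	// goroutine may have inserted the limiter between RUnlock and Lock.
	rlm.rw.Lock()
	defer rlm.rw.Unlock()
	if rl, ok := rlm.limiters[key]; ok {
		return rl
	}
	rl = new(ringBufferRateLimiter)
	rl.initialize(maxEvents, window)
	rlm.limiters[key] = rl
	return rl
}
```

The double-checked miss path is what makes the semantics harder to reason about, for what the profiles suggest would be little or no gain.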
I suggest waiting to merge this while we let it sit in production a while longer, to ensure the problem is fully fixed. I'll let it sit for about a week. In the meantime, this is ready for code review.