Closed robertgrimm closed 1 year ago
May or not be related, but I'm currently seeing a lot of Ruby segfaults like this:
ruby/2.2.0/gems/redis-3.2.1/lib/redis/connection/ruby.rb:49: [BUG] object allocation during garbage collection phase
I definitely agree with this issue — I've found that the redis-rb library can easily consume the limited number of connections I'm paying for on a third-party redis service, and so I have to do a lot of nasty application and system configuration to aggressively cleanup the connections.
Wow, I'm surprised this hasn't gotten more traction/feedback from the Redis gem authors?
I've been load testing a collection of microservices hosted on AWS and discovered that our connections weren't ever dropping down once opened. I've only just started looking into why, and this issue popped up first on Google :-(
I'll have to take a look at @hale's commit/fix and fork my own version.
Hey @BBCMarkMcDonnell. We maitain redis-rb in our free time and may not face the same problems as heavy users. Looks like this issue just fell through. @hale's fix was never submitted as a PR, so we didn't really see that either. I now pulled it in (#603) and will wait for the CI to report back.
Thanks for bringing this up again.
Thanks @badboy for the feedback, I wasn't aware of the situation. My apologies. I'm grateful you're looking back into this. Much appreciated!
Hi there! Has there been any progress on this? It looks like #603 has been abandoned since April.
Our Rails application is using redis-rb
on multiple frontend nodes to direct traffic to a Redis cluster on AWS. We upgraded redis-rb
from v3.1.0 to v3.3.1 earlier this week. Since then, our open connection counts have been rising:
I'm not totally sure the redis-rb
upgrade is the cause, but the timing is right. We never added manual disconnect calls in our application - the close
method isn't documented in the readme, and most of the examples don't contain close/disconnect calls either. And everything seemed okay at 3.1.0.
I'll be debugging on our end and will post anything I learn here. Thanks!
I have also manually added a bunch of redis.disconnect!
calls around our application to mitigate this.
I thought I was seeing similar issues and starting to look into it, could not reproduce the original report. @robertgrimm (or anyone else) can you still reproduce this on latest master?
Update on our particular situation: I haven't found or fixed the cause of our frontends leaving connections open, but I did find a workaround for our particular case: Using the timeout
parameter to make the Redis nodes drop idle clients.
This blog post from last September got me started. Once I was sure we could change the timeout on our nodes without any restart or service interruption we changed it from the default 0
(never expire) to 3600
(one hour), more than enough for our application.
The effect was dramatic, confirming my suspicion that most of these connections were stale.
That dropoff at the end isn't just an incomplete data point - it's the moment we set timeout: 3600
.
I was surprised that there's no expiration by default (on AWS anyway), but it sounds like that's the desired behavior for many applications - especially those using a small connection pool of long-lived connections, where it's very rare that connections are not cleaned up properly. In our case, I'm sure we never need a connection to last more than an hour so this will do for now.
I processed following script with redis-rb(4.2.2) and ruby 2.7.2 to see how GC works. GC seems to disconnect a connection.
Did someone fix it?
require 'redis'
redis = Redis.new
redis.get('foo')
sleep 10 # A connection exists
redis = nil
GC.start
sleep # There are no connection. GC seems to disconnect it.
same here, accumulating until a crash of limit of files!
@hale I'll test your commit at https://github.com/hale/redis-rb/commit/b63b0133a9338637ddc77ff6d176a62030f57b98 soon
@hale just confirmed that with your patch the number of total redis connections is under 100, and before it was skyrocketing to 1000 and then crashed nginx due to files limit.
[braulio@ip-10-122-17-224 ~]$ sudo lsof -a -p 2646 -p 2642 | grep :6379 | wc -l
86
@brauliobo what connector are you using? Hiredis or the regular "Ruby" one?
@byroot redis-rb 4.2.5, should I switch to https://github.com/redis/hiredis-rb ?
Not particularly, but I ask because I've tried to reproduce that problem multiple time with no success, and the previous maintainer tried too without success either.
The patch you used, while I'm happy to believe it works, make very little sense because ultimately all it does is closing a TCPSocket, which Ruby already does when the object is GCed.
That's why I'd like to understand what the problem is really.
Our use case is the following: we use redis-semaphore
to limit concurrency our API clients can request data. For redis-semaphore
to work properly, we cannot use a pool as it causes deadlocks. We need a new redis connection on every API request. That of course generates a lot of redis connections, specially under load
We need a new redis connection on every API request. That of course generates a lot of redis connections, specially under load
Sure, but, I did try to craft some repro scripts creating a lots of new connections like this, and wasn't able to reproduce the problem (short of disabling the GC)
So again unless somehow something holds the raw socket, I don't see how the patch could help. I'll try to take a look at redis-semaphore
when I get a bit of time to see if there's something fishy in it.
So that doesn't help figuring the connection leak, but I just skimmed at https://github.com/dv/redis-semaphore/blob/master/lib/redis/semaphore.rb and I don't see any blocking call or anything else that would requires to have a new connection per request.
Closing this given that the network layer was entirely rewritten in 5.x, so hopefully whatever this bug was is now gone.
If some people still notice this issue with redis-rb 5.x, please let me know and I'll reopen.
I noticed the Redis class makes no attempt to automatically disconnect/quit its client connection when the Redis object is garbage collected and instead appears to force the application to manage that. Is this by design? While applications can of course keep track of this themselves and call Redis#quit when appropriate, it appears that the Redis class can be modified to do that automatically so applications don't have to worry about that and thus not leak Redis connections.
For example:
This can be tested by doing something like
Another option would be add something like this to the Client class instead. Thoughts?