mperham / connection_pool

Generic connection pooling for Ruby
MIT License

Connection resets and connection timeouts on threads after fork #175

Closed shayonj closed 1 year ago

shayonj commented 1 year ago

After the release of https://github.com/mperham/connection_pool/pull/166, I noticed we were making a lot of new connections to our Redis cluster. It turns out this was happening because part of our system forks a process to run a command and relays the data back to the main process once the forked process finishes. The forked process just runs some internal commands and doesn't boot a Rails app or anything like that (such as a web server).

From what I understand, the change above is generally a good thing to have: when servers boot and fork new Rails/Sidekiq/similar workers, you want to avoid leaky pools.

However, applications that fork for internal business logic can see their connections shut down while threads in the main process are talking, or trying to talk, to Redis, which results in connection resets. The connection pool is then wiped out, and a thundering herd of new connections (depending on how many connection pools you have and the size of each pool) is seen as other threads try to reconnect. Some active connections see a "Connection lost" error as well.

I guess one solution would be to not fork non-Rails processes, and then this issue goes away. I am curious if anyone has thoughts on this?

I also wonder if it's cleaner to somehow make the child process re-init the pool and not sever the pool on the parent process? Alternatively, would it be better to introduce an at_fork hook that applications can tap into to clear the pool if they'd like? I see it was briefly discussed here (https://github.com/mperham/connection_pool/issues/165) also.
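To make the first idea concrete, here is a rough sketch of a pool that re-inits itself in the child and leaves the parent's connections alone. This is a toy, not how connection_pool is implemented: `PidAwarePool` and all of its methods are hypothetical names made up for illustration. It assumes it is enough for the child to drop its references to the inherited connections (without closing them) and lazily build new ones, detecting the fork by comparing `Process.pid` against the pid recorded when the pool was built.

```ruby
# Hypothetical toy pool for illustration only; NOT the connection_pool gem.
class PidAwarePool
  def initialize(size:, &factory)
    @size = size
    @factory = factory
    @mutex = Mutex.new
    reset!
  end

  # Yields a connection and returns it to the pool afterwards.
  def with
    conn = checkout
    begin
      yield conn
    ensure
      checkin(conn)
    end
  end

  private

  # Forget every known connection and record which process owns the pool.
  def reset!
    @owner_pid = Process.pid
    @idle = []
    @created = 0
  end

  def checkout
    @mutex.synchronize do
      # In a forked child, drop the references to inherited connections
      # WITHOUT closing them, so the parent's sockets are left untouched.
      reset! if Process.pid != @owner_pid

      return @idle.pop unless @idle.empty?
      raise "pool exhausted" if @created >= @size

      @created += 1
      @factory.call
    end
  end

  def checkin(conn)
    @mutex.synchronize { @idle.push(conn) }
  end
end
```

With something like this, a child that calls `pool.with` gets a connection created in the child, while the parent keeps reusing the connections it already had. A real version would also need to handle a connection that was checked out at the moment of the fork, which this sketch ignores.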

Previous issue ref: https://github.com/mperham/connection_pool/issues/165