mperham / connection_pool

Generic connection pooling for Ruby
MIT License
1.63k stars 143 forks source link

Non-local return forces disconnect #75

Closed mperham closed 9 years ago

mperham commented 9 years ago

The changes in #67 have resulted in unnecessary connection discards when using an explicit return:

pool = ConnectionPool.new
pool.with do |conn|
  return true
end

The explicit return means that the success = true line within with is not executed.

mperham commented 9 years ago

I can't figure out a way to handle the expected case of the non-local block return above, while also handling the unexpected non-local, uncatchable Timeout::Error.

:question: :confounded:

mperham commented 9 years ago

I'm not sure if Thread#handle_interrupt can help us here. http://ruby-doc.org/core-2.2.1/Thread.html#method-c-handle_interrupt

mperham commented 9 years ago

@tamird Any brilliant ideas?

tamird commented 9 years ago

@mperham yeah, Thread#handle_interrupt can help us by replacing the existing solution with uninterruptible ConnectionPool#with. I'm not really sure that's a sane thing to do, but the implementation should be a straight port of https://github.com/brianmario/mysql2/pull/592

mperham commented 9 years ago

Does that patch makes mysql2 impervious to Timeout.timeout, so the Timeout functionality simply doesn't work anymore if the mysql connection hangs?

benlovell commented 9 years ago

That's my understanding. :see_no_evil:

mperham commented 9 years ago

At this point I'm ready to rollback this entire set of changes and call this a documentation problem: don't use Timeout.timeout if you want your code to work reliably.

tamird commented 9 years ago

Seems reasonable to me. Heads up for the homies @ggilder @zanker

mperham commented 9 years ago

Ok, please review the code and let me know. I use handle_interrupt to avoid leaking connections in the case where a timeout occurs right as the connection is being checked out.

In thinking about this more, it seems like it is up to each connection to ensure its own safety in the face of Timeout, just as mysql2 did. Perhaps a fix to the redis gem is in order.

zanker commented 9 years ago

Ruby :(

Yea there's not much choice. Only way I know of to get around that is to wrap the calls in an ensure in another method, but you lose the results of return which makes it useless.

zanker commented 9 years ago

I'll leave @tamird to comment on the specifics, but JRuby 1.7.x does not support Thread.handle_interrupt. Whatever you do will have to check for the existence of the method.

mperham commented 9 years ago

@tamird How's that test look?

tamird commented 9 years ago

@mperham doesn't look like it tests that checkout (or checkin, for that matter) is protected from timeouts

mperham commented 9 years ago

Better?

tamird commented 9 years ago

Better, yeah. What's the point of the pool size assertion in that test? the "real" checkout happens before the sleep, so the assertion isn't validating anything not validated in the previous test

tamird commented 9 years ago

Also 1.9.3 and JRuby failed in CI as expected, that test probably needs to be marked pending if Thread doesn't respond_to handle_interrupt

mperham commented 9 years ago

Final call, this is ready to ship IMO. Any other comments?

ryansch commented 9 years ago

Is there anything dangerous about blocking all exceptions instead of just Timeout::Error?

Edit: And Timeout::ExitException apparently.

mperham commented 9 years ago

@ryansch I wish I knew. The semantics of handle_interrupt are not well documented at all. What happens if a Timeout occurs during a :never block? Is it silently discarded or delayed and raised once allowed? (Answer: it looks like the latter)

Should I just handle Timeout::ExitException?

ryansch commented 9 years ago

I'm unclear on when ExitException is raised as opposed to Error. The docs (http://ruby-doc.org/core-2.2.1/Thread.html#method-c-handle_interrupt-label-Guarding+from+Timeout-3A-3AError) go a different route and I don't understand ruby internals well enough to grok the :on_blocking option.

ryansch commented 9 years ago

The timeout.rb implementation is confusing at best. It appears that ExitException is meant to be a wrapper around the exception raising that ensures it's on the right thread. It uses throw/catch to do the dirty work. @tamird: Did you find that handle_interrupt caused that process to break down? I don't have enough time to attach a debugger and completely understand this code.

Mike: I was concerned about other exceptions but a more careful read of the docs suggest that handle_interrupt only applies to async errors from Thread.kill or Thread.raise. Would this be a problem when sidekiq wants to shutdown?

tamird commented 9 years ago

Break down? No. In the end, Thread#raise does the dirty work and

handle_interrupt just defers the interrupt. @mperham's tests demonstrated

this, when I last looked a few hours ago.

I don't know what sidekiq does at shutdown that is relevant to any of this. On Apr 10, 2015 6:18 PM, "Ryan Schlesinger" notifications@github.com wrote:

The timeout.rb implementation is confusing at best. It appears that ExitException is meant to be a wrapper around the exception raising that ensures it's on the right thread. It uses throw/catch to do the dirty work. @tamird https://github.com/tamird: Did you find that handle_interrupt caused that process to break down? I don't have enough time to attach a debugger and completely understand this code.

Mike: I was concerned about other exceptions but a more careful read of the docs suggest that handle_interrupt' only applies to async errors from Thread.killorThread.raise`. Would this be a problem when sidekiq wants to shutdown?

— Reply to this email directly or view it on GitHub https://github.com/mperham/connection_pool/pull/75#issuecomment-91707053 .

mperham commented 9 years ago

@ryansch For the most part, all bets are off when a Sidekiq Thread is shutting down "hard" via raise/kill. The process is going away so it doesn't really matter.

ryansch commented 9 years ago

To be clear @tamird, I was trying to figure out why we couldn't just ignore Timeout::Error instead of seeing Timeout::ExitException.

I'd be interested in using the following:

Thread.handle_interrupt(Timeout::Error: :never, Timeout::ExitException: :never) do
  ...
  Thread.handle_interrupt(Timeout::Error: :immediate, Timeout::ExitException: :immediate) do
    ...
  end  
end

:on_blocking is interesting as it sounds like it won't raise in client code unless the thread is asleep on IO (ex: select) but I don't believe that's the normal execution model for timeout.rb and might be surprising in client code.

ryansch commented 9 years ago

Ok I think I understand why we can just ignore Timeout::ExitException. It prevents the other thread's raise from working: https://github.com/ruby/ruby/blob/v2_2_1/lib/timeout.rb#L86

The end result is that the timeout method always returns on line 89 (as it's a Proc) and never has a chance to raise Timeout::Error.

tamird commented 9 years ago

LGTM (might want to squash). I'd probably do something about that failing rubinius test, annoying to always have a failure.

mperham commented 9 years ago

Thanks. I never squash, rebase or rewrite history. Saves me from ever having git issues.

Unsure what to do about Rubinius. I don't see how that failure's not a bug on their part.

zmoazeni commented 9 years ago

For what it's worth, this is what I was trying to debug in https://github.com/mperham/connection_pool/pull/67#discussion_r27772882