Add unwind safety - Githubissues

sgrif commented 5 years ago

Currently Pool does not implement UnwindSafe or RefUnwindSafe. This is due to Condvar not implementing it in the standard library. That type probably should implement it, but Pool shouldn't be UnwindSafe prior to this commit anyway. antidote::Mutex incorrectly implements UnwindSafe, when it should not as it removes the poisoning mechanism that makes Mutex be UnwindSafe in the first place.

Ultimately, prior to this commit, Pool<M> should only be UnwindSafe if M: UnwindSafe and M::Connection: UnwindSafe. The need for that bound on M::Connection is because we return the connection to the pool on panic, even if it's in a potentially invalid state.

This commit adds explicit implementations of UnwindSafe and RefUnwindSafe, and also removes the need to bound that on the connection being UnwindSafe by only returning a connection to the pool if it was not dropped during a panic. This ensures that we don't end up in a situation where a connection is potentially returned to the pool in a state where is_broken would return true, but it is not in an expected state (e.g. having an open transaction). It also means that the connection can be expected to be dropped if a panic occurs while it is being used (e.g. ensuring the connection is terminated if there was an open transaction).

We still need an UnwindSafe bound on the connection manager, as we can't guarantee that it will be in a reasonable internal state if a panic occurs in one of its methods

Fixes #63 Fixes #31

sfackler commented 5 years ago

At a philosophical level, I have a couple of concerns:

More broadly than this single issue, I think UnwindSafe/RefUnwindSafe are huge wastes of time and mental energy. In 5+ years of writing Rust, they have never provided any positive value to me, and in fact have provided significant negative value! The associated concept of Mutex poisoning has been an enormous pain with e.g. multiple RLS bugs with a panic somewhere in the depths of Racer turning into a total denial of service as some Mutex that's locked on every call gets poisoned. The fix to every single one of these issues I've ever seen has been simply "ignore poisoning with the lock().unwrap_or_else(|e| e.into_inner()) dance". I have literally never seen anything that was better off with a DoS than continuing with the shared state that was exposed to the panic.

The fact that postgres::Connection doesn't implement those traits is a perfect example of that. I'm pretty sure that the connection type is unwind safe, but so few people pay attention to these marker traits it's never even come up as an issue before! On the other hand, raw pointers are UnwindSafe, so basically every type in openssl is UnwindSafe without me having given any thought at all as to whether that's actually the case! Again, this has never come up as a problem in practice as far as I can tell.

In this specific instance, I don't think that dropping all connections returned while a thread is panicking is actually the right thing to do. "Something bad happened, so I guess we'll just throw this resource away just in case it was somehow involved" doesn't seem right to me. The unclosed transaction example brought up in the associated issue just seems like a poor transaction API, and there are multiple potential solutions. You could use an RAII-based API to manage the transaction and roll back in the destructor. This would fix both the panic-in-transaction case, and the early-return-in-transaction case from e.g. a SQL error. We could add CustomizeConnection::on_return and you could make an implementation that issues a rollback "just in case".

I'm open to making this behavior part of the configuration because it seems totally plausible to allow people to opt-into being a bit conservative in this regard. Making it a compile time Cargo feature seems like it's just going to add more complexity than it's worth.

sgrif commented 5 years ago

@sfackler I've taken another attempt at this which I think you'll find more palatable. Another feature I've been meaning to propose is adding public API to remove a connection from the pool. With that API in place, I've added a hook to ManageConnection which can be overridden to do whatever you want with the connection, either restoring whatever invariants need to be restored, or removing the connection from the pool. Can you take another look at this?

sgrif commented 5 years ago

@sfackler Any thoughts on this implementation? If you're good with moving forward on this I will rebase

sfackler / r2d2

Add unwind safety #70