Closed benhoyt closed 1 month ago
I think overall this is an improvement, as we will now see panics as they happen, but it may lead to some unexpected panics in production uses of pebble until we identify the cause of the deadlock that spawned this line of inquiry.
@hpidcock The thing is, those "unexpected panics in production uses" have a decent chance of deadlocking the entire Pebble process (if a lock is held with defer.Unlock()
), which is much worse than exiting (in which case the calling process will likely restart it). In any case, we'll chat further this afternoon about the best way forward.
Given we now know panic-recovery isn't the cause of the issue at hand, I'm going to come back to this after reproducing and fixing https://github.com/canonical/pebble/issues/314, but I'll leave this PR open -- I think it's probably a good idea in any case.
For reference, Gustavo was positive about this in a spec review meeting a couple of months ago (or more accurately, didn't love the default net/http behaviour and agreed it would be good to turn it off).
By default, Go's
net/http
recovers panics in theServeHTTP
goroutine and keeps serving. However, if a lock is held when the panic happens, this will mean a complete deadlock and Pebble will go unresponsive. As far as I know, this hasn't happened yet, but we use snippets withoutdefer
like this fairly frequently if it's just a tiny amount of code and defer is awkward:If
st.Foo()
or whatever's between the lock and unlock panics,net/http
will recover the panic but the lock will still be held. And in the case of the state lock that basically means all requests are blocked -- bad!There's no direct way of opting out of this
net/http
behaviour, so you have torecover()
from panics yourself, print the goroutine dump, and exit.I've tested this locally, and it prints the goroutine dump and exits with code 1 on a panic as expected, instead of locking up. It also handles any panic in the handler or middleware, eg in
logit
.