In a very specific combination of circumstances, emitting any event can deadlock if the emission itself tries to modify the Hub, for example by adding a breadcrumb. It is easy to reproduce with the following setup:
- sentry-log (or sentry-tracing if tracing-log is set up)
- event filtering configured to generate breadcrumbs for debug logs
- any cause for events to not be sent to the server
If events fail to be sent for long enough, the channel to the sender thread fills up and events start to be dropped. Each drop emits a debug log line saying that an event was dropped and why. With sentry-log (or tracing-log + sentry-tracing), that log line is turned into a breadcrumb.
However, the whole call to Transport::send_envelope happens inside HubImpl::with, which holds a read lock on the HubImpl's stack. Generating the breadcrumb for the debug line ends up calling Hub::add_breadcrumb, which calls HubImpl::with_mut to get a mutable reference to the top of the stack. Since we already hold a read lock on the stack, we deadlock waiting for the write lock.
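The lock interaction described above can be sketched with the standard library alone. This is a minimal illustration, not the sentry code: `Stack`, `try_add_breadcrumb`, `add_while_reading` and `add_after_reading` are hypothetical names, and `try_write` is used instead of `write` so that the example reports the conflict instead of hanging.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the Hub's stack; not the real sentry type.
struct Stack {
    breadcrumbs: RwLock<Vec<String>>,
}

impl Stack {
    /// Runs `f` while holding the read lock, like the `with` helper described above.
    fn with<R>(&self, f: impl FnOnce(&Vec<String>) -> R) -> R {
        let guard = self.breadcrumbs.read().unwrap();
        f(&guard)
    }

    /// Tries to take the write lock without blocking.
    /// Returns false where a blocking `write()` could wait forever.
    fn try_add_breadcrumb(&self, msg: &str) -> bool {
        match self.breadcrumbs.try_write() {
            Ok(mut guard) => {
                guard.push(msg.to_owned());
                true
            }
            Err(_) => false,
        }
    }
}

/// Adds a breadcrumb from inside the read-locked closure (the problematic pattern).
fn add_while_reading(stack: &Stack) -> bool {
    stack.with(|_top| stack.try_add_breadcrumb("event dropped: channel full"))
}

/// Adds a breadcrumb after the read lock has been released.
fn add_after_reading(stack: &Stack) -> bool {
    stack.with(|top| top.len());
    stack.try_add_breadcrumb("event dropped: channel full")
}

fn main() {
    let stack = Stack { breadcrumbs: RwLock::new(Vec::new()) };
    println!("write inside read lock succeeded: {}", add_while_reading(&stack));
    println!("write after read lock succeeded: {}", add_after_reading(&stack));
}
```

With a blocking `write()` in place of `try_write()`, the first call is the situation this PR is about: the thread waits for a lock that only it can release.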
The fix is to move the call to Transport::send_envelope outside of the locked section and to use HubImpl::with only to clone the top StackLayer. Since that structure contains only Arcs, the cost is two Arc clones rather than a clone of the whole stack.
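A rough sketch of the shape of that fix, under simplifying assumptions: the types below are hypothetical miniatures that reuse the names from this description, the fields of `StackLayer` are invented for illustration, and the transport is replaced by a direct call that simulates "event dropped, log line turned into a breadcrumb".

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical miniature of a stack layer: only Arcs, so cloning is cheap.
#[derive(Clone)]
struct StackLayer {
    client: Arc<String>,
    scope: Arc<Vec<String>>, // breadcrumbs live in the scope
}

struct HubImpl {
    stack: RwLock<Vec<StackLayer>>,
}

impl HubImpl {
    fn new(client: &str) -> Self {
        let layer = StackLayer {
            client: Arc::new(client.to_owned()),
            scope: Arc::new(Vec::new()),
        };
        HubImpl { stack: RwLock::new(vec![layer]) }
    }

    /// Runs `f` under the read lock.
    fn with<R>(&self, f: impl FnOnce(&Vec<StackLayer>) -> R) -> R {
        let guard = self.stack.read().unwrap();
        f(&guard)
    }

    /// Runs `f` under the write lock.
    fn with_mut<R>(&self, f: impl FnOnce(&mut Vec<StackLayer>) -> R) -> R {
        let mut guard = self.stack.write().unwrap();
        f(&mut guard)
    }

    fn add_breadcrumb(&self, msg: &str) {
        self.with_mut(|stack| {
            let top = stack.last_mut().expect("stack is never empty");
            // Copy-on-write: clones the scope only if someone else still holds it.
            Arc::make_mut(&mut top.scope).push(msg.to_owned());
        });
    }

    fn breadcrumb_count(&self) -> usize {
        self.with(|stack| stack.last().map_or(0, |top| top.scope.len()))
    }

    /// Clone the top layer under the read lock, then "send" outside of it.
    fn capture(&self, event: &str) {
        let top = self.with(|stack| stack.last().expect("stack is never empty").clone());
        // The read lock is released here. Simulate a full channel whose
        // debug log line is converted into a breadcrumb.
        self.add_breadcrumb(&format!("dropped {} (client {})", event, top.client));
    }
}

fn main() {
    let hub = HubImpl::new("demo-client");
    hub.capture("event-1");
    println!("breadcrumbs recorded: {}", hub.breadcrumb_count());
}
```

Because the read guard is dropped as soon as `with` returns, the later `with_mut` call no longer competes with a lock held further up the same call stack.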
We hit this in production because (for reasons) we had to backport the legacy pre-envelope transport, which logs at the warning level when the channel is full. The default log filter is safe from this deadlock because debug events are dropped, but anyone who enables at least breadcrumbs for debug events is exposed to it.
Thanks!