Closed sj26 closed 7 months ago
Hi @sj26
Thank you for providing this PR. I've added this to our backlog and will review this when priorities allow.
This bit us again today.
Could you please take a look at this? It's quite a small fix and shouldn't take much of your time.
Hi @sj26
Just a heads up that this is on our list to look at. I still can't give an ETA on when it'll be fully assessed and tested, but we'll make sure to provide any updates on this thread.
Thanks for your patience in the meantime.
Goal
We had an outage today after recently introducing Bugsnag into our codebase, because our Resque failure backend was already `Resque::Failure::Multiple`. This code looks like it handles that case, except for a small mistake with the operator: the backend is likely to be the `Resque::Failure::Multiple` class itself, not a sub-class.
This meant that when the Bugsnag instrumentation code ran, a failure was reported to Redis, then Bugsnag, then Redis, then Bugsnag, and so on, until we got a "stack overflow" error.
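To make the loop concrete, here is a minimal sketch in plain Ruby. Every class name in it is a hypothetical stand-in, not the real Resque or Bugsnag code, and it assumes the fan-out backend ends up listed among its own targets, in an order chosen to match the description above.

```ruby
# Stand-in classes for illustration only; none of these are the real
# Resque or Bugsnag classes.
REPORT_LOG = []

class SketchRedisBackend
  def save
    REPORT_LOG << :redis
  end
end

class SketchBugsnagBackend
  def save
    REPORT_LOG << :bugsnag
  end
end

# A fan-out backend that forwards each failure to every class in its list.
class SketchMultipleBackend
  class << self
    attr_accessor :classes
  end

  def save
    self.class.classes.each { |klass| klass.new.save }
  end
end

# Assumed misconfiguration: the fan-out backend appears in its own list.
SketchMultipleBackend.classes = [
  SketchRedisBackend,
  SketchBugsnagBackend,
  SketchMultipleBackend
]

# Returns true if saving a single failure exhausts the stack.
def save_overflows?
  SketchMultipleBackend.new.save
  false
rescue SystemStackError
  true
end

puts save_overflows?
puts REPORT_LOG.first(4).inspect
```

In this sketch a single `save` never returns normally: each pass reports to both stand-in backends and then re-enters the fan-out backend until Ruby raises `SystemStackError`.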
For us, a worker needed cleanup during Resque boot, which involves reporting a failure, so none of our Resque workers would boot or process work. This resulted in an outage of all background queue processing.
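The operator distinction mentioned above can be checked in plain Ruby. This is only a sketch: the classes are hypothetical stand-ins rather than the real Resque constants, and the helper name is made up for illustration. It relies on standard Ruby behaviour, where `<` on classes is true only for a proper subclass and `<=` is also true for the class itself.

```ruby
# Stand-in classes for illustration; these are not the real Resque constants.
class SketchMultiple; end
class SketchCustomMultiple < SketchMultiple; end
class SketchRedis; end

# Compare a configured backend class against SketchMultiple with both operators.
def compare_with_multiple(backend)
  {
    strict: backend < SketchMultiple,     # true only for a proper subclass
    inclusive: backend <= SketchMultiple  # also true for the class itself
  }
end

p compare_with_multiple(SketchMultiple)
p compare_with_multiple(SketchCustomMultiple)
p compare_with_multiple(SketchRedis)
```

A check written with the strict operator therefore misses the case where the configured backend is exactly the fan-out class, which is the situation described in this PR. For unrelated classes both operators return `nil`.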
Testing
The tests here are pretty literal and test the implementation more than the outcome. It might be possible to refactor them to test the Resque side of things a little more, but that's more than I'd like to bite off in this PR.
We are currently using `BUGSNAG_DISABLE_AUTOCONFIGURE` as a workaround, and would love to get this change merged and released quickly so we can use Bugsnag error reporting for Resque :pray: