Closed ferusinfo closed 6 years ago
Right now Gush captures all exceptions to internally mark jobs as failed without raising them further to Sidekiq. So as far as Sidekiq is concerned all jobs succeed. This turned out to be rather annoying for developers.
This will change in the 1.0.0 version I plan on releasing Soon(TM).
What is the estimated ETA for the 1.0.0 version? Happy to cooperate, too.
Also, the problem that I've found so far is that even after the flow.reload the failed workflow is still returning :running on the flow.status method.
If everything goes right, this week :)
Hm, it should not be marked as running. Can you create a separate issue for this?
Sure, let me do this in a second.
I pushed a change which raises errors after marking jobs as failed, so Sidekiq can retry them. If you can have a look, that'd be perfect :)
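For readers following along, the change described above follows a common pattern: record the failure first, then re-raise so the queue backend still sees the error. The sketch below is plain Ruby and is not Gush's actual implementation; `FakeJob` and `run_job` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a workflow job; not part of Gush.
class FakeJob
  attr_reader :state

  def initialize(&work)
    @work = work
    @state = :pending
  end

  def perform
    @work.call
  end

  def fail!
    @state = :failed
  end

  def finish!
    @state = :finished
  end
end

# Runs the job, records the outcome, and re-raises any error so the
# caller (a queue backend such as Sidekiq) can apply its own retry policy.
def run_job(job)
  job.perform
  job.finish!
rescue StandardError
  job.fail!
  raise
end
```

Swallowing the exception after `fail!` would reproduce the earlier behaviour, where the backend considers every job successful.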
Even though the job raises an error, the configuration says retries: false. This will cause Sidekiq to discard the job immediately.
https://github.com/mperham/sidekiq/wiki/Error-Handling#configuration
For jobs that fail due to transient issues, like being unable to obtain a database connection, this causes the workflow to stall in an unrecoverable way. It's not possible to reload and continue the workflow, or I haven't figured out how to do it correctly.
@carlthuringer what kind of exception are you getting? Gush catches those itself before Sidekiq does and allows retries for the users of Gush (either via CLI or the web gui)
I think I determined my issue. I was expecting Workflow#reload to actually replace/rebuild the instance, but instead it just returns the loaded instance, so to get a proper status update and continue, you have to flow = flow.reload; flow.continue.
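The difference between a reload that mutates the receiver and one that returns a fresh object can be illustrated with a toy model. This is only a sketch of the behaviour described in the comment above: `FakeFlow` and the hash-backed store are hypothetical stand-ins, not Gush classes.

```ruby
# Hypothetical workflow object backed by a plain Hash instead of Redis.
class FakeFlow
  attr_reader :id, :status

  def initialize(id, store)
    @id = id
    @store = store
    @status = store.fetch(id) # snapshot taken at load time
  end

  # Returns a NEW object built from the store; the receiver is unchanged.
  def reload
    FakeFlow.new(id, @store)
  end
end

store = { "wf-1" => :running }
flow = FakeFlow.new("wf-1", store)
store["wf-1"] = :failed # a worker updates the persisted state

flow.reload             # return value discarded, `flow` is still stale
stale = flow.status

flow = flow.reload      # rebind to the fresh instance
fresh = flow.status
```

Here `stale` still holds the old snapshot while `fresh` reflects the stored state, which is why the reloaded instance has to be assigned before calling anything else on it.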
@pokonski in case of unavailability of an external service I'd prefer to reschedule the work for later, rather than retry it immediately. Sidekiq has excellent tools for addressing such issues: an exponentially growing delay before each retry, and a callback invoked after all retries are exhausted.
It would be great to support these features in gush.
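To make the two features mentioned above concrete, here is a minimal plain-Ruby sketch of delayed retries with an "exhausted" callback. In Sidekiq itself these correspond to the `sidekiq_retry_in` and `sidekiq_retries_exhausted` hooks; the functions below are hypothetical and only model the idea. The delay formula is an assumption loosely based on the default documented in Sidekiq's error-handling wiki (roughly `count ** 4 + 15` seconds plus random jitter).

```ruby
# Delay in seconds before retry number `count` (0-based).
# `jitter` is injectable so the result can be made deterministic.
def backoff_delay(count, jitter: rand(10) * (count + 1))
  (count**4) + 15 + jitter
end

# Hypothetical runner: re-runs the block on failure and collects the delays
# it would have waited (it does not actually sleep). Calls `on_exhausted`
# with the last error once `max_retries` retries have been used up.
def run_with_retries(max_retries:, on_exhausted:)
  delays = []
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError => e
    if attempts <= max_retries
      delays << backoff_delay(attempts - 1, jitter: 0)
      retry
    end
    on_exhausted.call(e)
  end
  delays
end
```

A real backend would schedule the job for later instead of looping in-process; the loop here only shows when each delay and the final callback would apply.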
@bolshakov agreed, I'm actually considering letting them fail instead so Sidekiq can handle that.
Version 1.0.0 re-raises the error so it can now be retried by the backend of your choosing. Closing :)
@pokonski how would you customise the retry behaviour for Sidekiq + Gush jobs?
We solved this by injecting a rescue_from into the Gush::Worker base class in an initializer:
# config/initializers/gush.rb
Gush::Worker.class_eval do
  rescue_from(StandardError) do |e|
    # Any handling you want to do e.g. report to Sentry/Rollbar/etc
  end
end
This stops Sidekiq from retrying jobs in our Gush workflows after the workflows are marked as failed.
How does Gush handle (or not handle) errors that might occur in a Job class? I've tried to use raise, but all of the tasks passed without an error and nothing showed up in either Sidekiq or the workflow (flow.status is returning :running).
Is there any way to integrate Gush with Sidekiq's worker error handling? Or is it too complicated?