chaps-io / gush

Fast and distributed workflow runner using ActiveJob and Redis
MIT License

Jobs that eventually succeed will not queue downstream jobs #61

Closed jdreic closed 5 years ago

jdreic commented 5 years ago

When a job raises an exception, Gush marks it as failed and re-raises the error. Job frameworks like Sidekiq retry failed jobs by default. When a job fails but succeeds on a later retry, Gush still has it marked as failed from the first run, so downstream jobs that depend on it never run.

It would be great if there were an option to clear the failed status when a job re-runs, so that if it succeeds, the downstream workflow can continue.

pokonski commented 5 years ago

Hey @jdreic, thanks for the report. This sounds like a bug!

theo-delaune-argus commented 5 years ago

Hi,

The problem is in the job's state. If the first run of a job fails, it sets @failed_at to the time of the failure. But when the job is retried, @failed_at is not reset to nil.

If you take a look at https://github.com/chaps-io/gush/blob/master/lib/gush/worker.rb#L73, when child jobs are enqueued, it checks whether the parent has succeeded. Since the parent job still has a non-nil @failed_at, the children are not enqueued.
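
To make the gating described above concrete, here is a minimal sketch. It is not Gush's code: `SketchJob`, `ready_to_enqueue?` and the job name are hypothetical stand-ins, and the only assumption is that "succeeded" means "finished and never marked as failed".

```ruby
# Hypothetical, simplified model of a job's status timestamps (not Gush's code).
SketchJob = Struct.new(:name, :finished_at, :failed_at, keyword_init: true) do
  # Assumption: a job counts as succeeded only if it finished and has no failure timestamp.
  def succeeded?
    !finished_at.nil? && failed_at.nil?
  end
end

# A child may be enqueued only when every parent has succeeded.
def ready_to_enqueue?(parents)
  parents.all?(&:succeeded?)
end

first_failure = Time.now - 60

# Parent failed a minute ago, was retried, and has just finished,
# but the failure timestamp from the first attempt was never cleared.
stale_parent = SketchJob.new(name: "fetch", finished_at: Time.now, failed_at: first_failure)

# Same parent with the failure timestamp reset before the retry.
reset_parent = SketchJob.new(name: "fetch", finished_at: Time.now, failed_at: nil)

stale_ready = ready_to_enqueue?([stale_parent]) # false: the child is held back
reset_ready = ready_to_enqueue?([reset_parent]) # true: the child can be enqueued
```

The stale timestamp alone is enough to block the child, even though the retry completed.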

This is my fix:

# Override Gush::Job start!
# To ensure failed_at is reset when job is relaunched by Sidekiq
class BaseJob < Gush::Job
  def start!
    super
    @failed_at = nil
  end
end
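
As a rough check of the idea, the same override can be tried against a stand-in class. `StandInJob`, `PatchedJob` and `fail_then_retry` are hypothetical names for illustration, and the stand-in only approximates how a job might track its state; it is not Gush::Job.

```ruby
# Hypothetical stand-in for a job class, reduced to the state that matters here.
class StandInJob
  def start!
    @started_at = Time.now
  end

  def fail!
    @finished_at = @failed_at = Time.now
  end

  def finish!
    @finished_at = Time.now
  end

  def succeeded?
    !@finished_at.nil? && @failed_at.nil?
  end
end

# Same override as in the fix, applied to the stand-in.
class PatchedJob < StandInJob
  def start!
    super
    @failed_at = nil # forget the failure recorded by an earlier attempt
  end
end

# Simulates a first attempt that fails followed by a retry that completes.
def fail_then_retry(job)
  job.start!
  job.fail!   # first attempt raises
  job.start!  # retry begins
  job.finish! # retry completes
  job.succeeded?
end

unpatched_result = fail_then_retry(StandInJob.new) # false
patched_result   = fail_then_retry(PatchedJob.new) # true
```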

mickael-palma-argus commented 5 years ago

Hi,

I prefer using refinements 😄

# Temporary fix: https://github.com/chaps-io/gush/issues/61
# Overrides Gush::Job start! method.
# Resets failed_at variable when Sidekiq reloads a job.
module GushJobFix
  refine Gush::Job do
    def start!
      super
      @failed_at = nil
    end
  end
end

class MyJob < Gush::Job
  using GushJobFix

  def perform
    ...
  end
end
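
One caveat worth checking before relying on this: refinements are lexically scoped, so a refined method is only seen by calls written where `using` is active. The sketch below uses hypothetical stand-ins (`StandInJob`, `StandInWorker`, `ResetOnStart`, `restart_from_inside`) and assumes that start! is invoked from library code outside the job class. Under that assumption the refinement would not take effect for those calls, and the subclass override may be the safer workaround.

```ruby
# Hypothetical stand-ins; these are not Gush's classes.
class StandInJob
  attr_reader :failed_at

  def start!
    @started_at = Time.now
  end

  def fail!
    @failed_at = Time.now
  end
end

module ResetOnStart
  refine StandInJob do
    def start!
      super
      @failed_at = nil
    end
  end
end

# Plays the role of library code that starts jobs.
# The refinement is never activated in this scope.
class StandInWorker
  def run(job)
    job.start!
  end
end

class MyJob < StandInJob
  using ResetOnStart

  # Calls written inside this class body see the refined start!.
  def restart_from_inside
    start!
  end
end

via_worker = MyJob.new
via_worker.fail!
StandInWorker.new.run(via_worker)
worker_cleared = via_worker.failed_at.nil? # false: the unrefined start! ran

via_self = MyJob.new
via_self.fail!
via_self.restart_from_inside
inside_cleared = via_self.failed_at.nil?   # true: the refined start! ran
```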

pokonski commented 5 years ago

Hey guys! Thank you for finding the culprit! Will release a fix ASAP!

pokonski commented 5 years ago

@mickael-palma-argus @theo-delaune-argus @jdreic this is now released as 2.0.1, thanks again for the report and help with identifying the issue ❤️

/cc @devilankur18 @hqm42 @vadshalamov

mickael-palma-argus commented 5 years ago

You rock 👍 Great specs BTW 😃