chaps-io / gush

Fast and distributed workflow runner using ActiveJob and Redis
MIT License
1.04k stars 104 forks source link

After gets called n-times #46

Closed matti closed 6 years ago

matti commented 6 years ago

Tried the README's example:

class SimpleWorkflow < Gush::Workflow
  def configure
    run FirstDownloadJob
    run SecondDownloadJob

    run SaveJob, after: [FirstDownloadJob, SecondDownloadJob]
  end
end

SaveJob will be invoked twice:

2017-11-04T15:51:12.522Z 5060 TID-ovvcwd854 Gush::Worker JID-f3bddd0cf26cd11492d97a6b INFO: start
2017-11-04T15:51:12.527Z 5060 TID-ovvcwd95s Gush::Worker JID-beda8811a9c7976ef33abd41 INFO: start
SecondDownloadJob#perform {}
FirstDownloadJob#perform {}
2017-11-04T15:51:12.569Z 5060 TID-ovvcwcmx8 Gush::Worker JID-18a48a9a6ce45b6aee78bb35 INFO: start
2017-11-04T15:51:12.571Z 5060 TID-ovvcwd854 Gush::Worker JID-f3bddd0cf26cd11492d97a6b INFO: done: 0.049 sec
2017-11-04T15:51:12.580Z 5060 TID-ovvcwd9o4 Gush::Worker JID-b98176b881db8a61e8f7cdb9 INFO: start
2017-11-04T15:51:12.586Z 5060 TID-ovvcwd95s Gush::Worker JID-beda8811a9c7976ef33abd41 INFO: done: 0.059 sec
SaveJob#perform {}
2017-11-04T15:51:12.597Z 5060 TID-ovvcwcmx8 Gush::Worker JID-18a48a9a6ce45b6aee78bb35 INFO: done: 0.029 sec
SaveJob#perform {}
2017-11-04T15:51:12.601Z 5060 TID-ovvcwd9o4 Gush::Worker JID-b98176b881db8a61e8f7cdb9 INFO: done: 0.02 sec
matti commented 6 years ago

Same with:

class SimpleWorkflow < Gush::Workflow
  def configure
    run FirstDownloadJob, before: SaveJob
    run SecondDownloadJob, before: SaveJob

    run SaveJob
  end
end
matti commented 6 years ago

actually... ...this is intended behaviour? Is there a way to call SaveJob just once?

matti commented 6 years ago

so that both DownloadJobs would be completed before

pokonski commented 6 years ago

It definitely should only trigger the SaveJob only once. Will reproduce and fix, because this is a regression :(

On Nov 4, 2017 16:58, "Matti Paksula" notifications@github.com wrote:

so that both DownloadJobs would be completed before

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/chaps-io/gush/issues/46#issuecomment-341907968, or mute the thread https://github.com/notifications/unsubscribe-auth/AAIQ6k4MRk4vNI5cp5qJKIjFXa6IRCNaks5szImMgaJpZM4QSDbU .

matti commented 6 years ago

Actually, I'm not sure if this behaviour is a bug - it follows the diagram nicely - each notification job triggers another job... But yes, this was the main use case for using gush for me.

pokonski commented 6 years ago

Yeah this is unintended, it should only trigger the job if all of its dependencies have finished successfully. So only the second check should see that the other job finished and only then should enqueue the SaveJob. I suspect the state is incorrectly persisted along the way.

On Nov 4, 2017 17:01, "Matti Paksula" notifications@github.com wrote:

Actually, I'm not sure if this behaviour is a bug - it follows the diagram nicely - each notification job triggers another job... But yes, this was the main use case for using gush for me.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/chaps-io/gush/issues/46#issuecomment-341908217, or mute the thread https://github.com/notifications/unsubscribe-auth/AAIQ6ng67hnNXXlSNaB-U_KbhCf4n5H7ks5szIpjgaJpZM4QSDbU .

matti commented 6 years ago

clarification: yes, it's a bug - but also an interesting feature that kinda follows the diagram.

pokonski commented 6 years ago

Ha, maybe that also needs a configuration option :D

On Nov 4, 2017 17:13, "Matti Paksula" notifications@github.com wrote:

clarification: yes, it's a bug - but also an interesting feature that kinda follows the diagram.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/chaps-io/gush/issues/46#issuecomment-341909094, or mute the thread https://github.com/notifications/unsubscribe-auth/AAIQ6vHTceEHCKBQGR9m5PItMZqCKee6ks5szI0ygaJpZM4QSDbU .

dmitrypol commented 6 years ago

I also experienced this behavior in my tests

erated commented 6 years ago

Any updates about this? We have encountered this as well.

tanin commented 6 years ago

Hi, is there any news about this bug?

pokonski commented 6 years ago

@erated @tanin haven't had a chance to investigate, but can you share your minimal case to reproduce, too? Will help tracking it down.

matti commented 6 years ago

@pokonski isnt my original example in the beginning of thia thread enough?

pokonski commented 6 years ago

@matti it's enough, I reproduced it in specs but seems to be caused by a race condition when using an async adapter for ActiveJob.

pokonski commented 6 years ago

Alright, master now contains a fix for this issue @dmitrypol @matti @tanin. Please let me know if it helps!

This should only be an issue for jobs finishing almost at the same time, most apparent in tests because usually the jobs are empty and execute close to each other. I added a mutex so only one of the ancestors can en-queue his descendant at any given time while others have to wait.

matti commented 6 years ago

But atleast I didn't use ActiveJob. let's test this..

pokonski commented 6 years ago

@matti it's not actually related to ActiveJob, it's simply caused when jobs are running in parallel and both trigger their descendant veeery close to each other. Now I put a lock so triggering new jobs is run one at a time, solving the race condition.

lastobelus commented 6 years ago

@matti it's not actually related to ActiveJob, it's simply caused when jobs are running in parallel and both trigger their descendant veeery close to each other. Now I put a lock so triggering new jobs is run one at a time, solving the race condition.

is this bug fixed then?

matti commented 6 years ago

@pokonski ?

pokonski commented 6 years ago

@lastobelus Yes, fixed in https://github.com/chaps-io/gush/commit/49000a30fd34ac21cabdc7d2c24ffbfa3b113bb9