eddiewebb / circleci-queue

CircleCI orb to block/queue jobs to enforce max concurrency limits
MIT License

Race condition during workflow transitions #26

Closed eddiewebb closed 4 years ago

eddiewebb commented 5 years ago

If a queued job happens to hit the API at just the right time, while the earlier workflow is between jobs, it will not find any running jobs.

I think calling the workflow API differently can solve this.

Original issue:


So I have a multistep job workflow, as follows: 1) queue, 2) build, 3) deploy.

I have definitely seen the queue step on Workflow 2 let it through while Workflow 1 was transitioning from step 2 to step 3. What kind of data can I grab to help debug?

danpalmer commented 4 years ago

@eddiewebb hey did you make any progress on this? As our release cadence has increased we've started hitting this race condition pretty frequently (happened twice in the last 24 hours), and it's a key piece of our ability to use CircleCI.

Is there any likelihood of this orb being moved into CircleCI support? It's a fairly critical feature for continuous delivery.

eddiewebb commented 4 years ago

Hey, I've been working with our internal team to try to get some basic filtering into our v2 APIs, but was unsuccessful.

It should still be possible to inspect the workflow state instead of the job status. I'll look at this today.
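To illustrate why checking workflow state could close the gap, here is a minimal Python sketch, not the orb's actual code. The `Job` and `Workflow` classes, the two check functions, and the status strings are all hypothetical stand-ins for whatever the API returns. It models Workflow 1 at the moment build has finished and deploy has not yet started.

```python
from dataclasses import dataclass
from typing import List

# Illustrative status values; the real API's vocabulary may differ.
ACTIVE_WORKFLOW_STATUSES = {"running", "failing", "on_hold"}


@dataclass
class Job:
    name: str
    status: str


@dataclass
class Workflow:
    workflow_id: str
    status: str
    jobs: List[Job]


def blocks_by_job_status(workflow: Workflow) -> bool:
    """Job-level check: block only if some job is running right now."""
    return any(job.status == "running" for job in workflow.jobs)


def blocks_by_workflow_status(workflow: Workflow) -> bool:
    """Workflow-level check: block while the workflow as a whole is active."""
    return workflow.status in ACTIVE_WORKFLOW_STATUSES


# Workflow 1 caught mid-transition: build is done, deploy has not started.
mid_transition = Workflow(
    workflow_id="workflow-1",
    status="running",
    jobs=[
        Job("queue", "success"),
        Job("build", "success"),
        Job("deploy", "blocked"),
    ],
)
```

In this snapshot the job-level check finds nothing running and would release Workflow 2, while the workflow-level check still blocks.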

eddiewebb commented 4 years ago

@danpalmer I opened #46 to provide a quick fix. It simply adds redundancy to the check, the idea being that enough time will pass between checks for any jobs from the previous workflow to show up on the "running" filter.

You can adjust the confidence parameter to the number of backup checks to make.

If you want to try that before it's released, you can use eddiewebb/queue@dev:46
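The redundancy idea can be sketched roughly as follows. This is an assumption-laden Python illustration, not the code in #46: `wait_until_clear`, `count_running_ahead`, `delay_seconds`, and `max_checks` are hypothetical names, and `confidence` is interpreted here as the number of extra clear checks required after the first one, which may not match the orb's exact semantics.

```python
import time
from typing import Callable


def wait_until_clear(
    count_running_ahead: Callable[[], int],
    confidence: int = 1,
    delay_seconds: float = 10.0,
    max_checks: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the queue looks clear on several consecutive checks.

    Returns the number of checks performed. A single clear reading is not
    trusted, because it may fall in the gap between two jobs of an
    earlier workflow.
    """
    needed = 1 + confidence  # first clear check plus the backup checks
    streak = 0
    for check in range(1, max_checks + 1):
        if count_running_ahead() == 0:
            streak += 1
            if streak >= needed:
                return check
        else:
            # Something ahead of us reappeared, so start counting again.
            streak = 0
        sleep(delay_seconds)
    raise TimeoutError("queue never stayed clear long enough")
```

The trade-off is latency: each extra confidence check adds one delay interval before a job at the front of the queue is released.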

eddiewebb commented 4 years ago

Marking closed, please let me know if 1.4.0 does not address the behavior.

danpalmer commented 4 years ago

@eddiewebb thanks for this! I’m off until late Jan so will catch up on testing when I’m back. Thanks for maintaining this!

I hope it makes it into the core feature set soon as strictly ordered deploys are critical to all correct continuous delivery systems that I’ve seen.

tberman commented 4 years ago

I just upgraded, and got this:

master queueable
This build will block until all previous builds complete.
Max Queue Time: 1000000 minutes.
Only blocking execution if running previous jobs on branch: master
Attempting to access CircleCI api. If the build process fails after this step, ensure your CIRCLECI_API_KEY is set.
API access successful
Checking time of workflow: 64c04663-fa40-4a8e-a4ee-75da3bc7bc46
Exited with code exit status 22
CircleCI received exit code 22

eddiewebb commented 4 years ago

Todd, are you using a project-level token? Exit code 22 is likely the curl command erroring out. API v2 needs personal tokens.
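For anyone debugging the same symptom: curl returns exit code 22 when it is run with `--fail` and the server answers with an HTTP status of 400 or above. The Python sketch below is one way to check a token outside the orb, assuming the v2 `me` endpoint and the `Circle-Token` header. The function names are hypothetical, and whether the orb's curl call actually uses `--fail` is an assumption based on the exit code.

```python
import urllib.error
import urllib.request

API_URL = "https://circleci.com/api/v2/me"


def build_request(token: str) -> urllib.request.Request:
    """Build an authenticated request without sending it."""
    return urllib.request.Request(API_URL, headers={"Circle-Token": token})


def would_curl_fail(status: int) -> bool:
    """With --fail, curl exits 22 for any HTTP status of 400 or above."""
    return status >= 400


def check_token(token: str) -> int:
    """Send the request and return the HTTP status (performs network I/O)."""
    try:
        with urllib.request.urlopen(build_request(token), timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status is in err.code.
        return err.code
```

If `check_token` returns a 401 or 403 for the token stored in `CIRCLECI_API_KEY`, the token type is a plausible cause of the exit status 22 shown in the log above.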

tberman commented 4 years ago

Not sure, I just upgraded from 1.1.2 to this. Did the token type change between 1.1.2 and now?

eddiewebb commented 4 years ago

Quite a bit has changed since that release. Can you open a new ticket for this issue? I'll take a look this weekend.