eddiewebb / circleci-queue

CircleCI orb to block/queue jobs to enforce max concurrency limits
MIT License
74 stars 75 forks source link

[1.7.0] Orb not working properly when pipeline uses `path-filtering` orb #95

Closed axilleastr closed 1 year ago

axilleastr commented 1 year ago

Orb version

orbs:
  queue: eddiewebb/queue@1.7.0

What happened

We are using the path-filtering orb, which under the neath utilizes the continuation orb, and dispatches workflows through CircleCI API based on paths of changed files.

Ultimately, for those not familiar with it, in practice, that works as follows:

  1. A commit triggers a new CircleCI pipeline which starts with an initial workflow (ie generate-config). The generate-config workflow, uses the path-filtering orb, and is responsible to dispatch the creation of some follow up workflows based on files changed, with the continuation orb.
  2. The dispatched workflows take place, and run until completion/failure under the same pipeline context.

A nice example can be found here as well.

It seems that for a workflow to be blocked, comparison happens based on this expression:

if [[ ! -z "$my_commit_time" ]] && [[ "$oldest_commit_time" > "$my_commit_time" || "$oldest_commit_time" = "$my_commit_time" ]] ;

where,

Normally, this works fine with workflows that are triggered upon a pipeline is being created. However, that does not work well with pipelines that rely on the path-filtering and continuation orb, since in that case, workflows are dispatched afterwards, and their creation time does not really reflect the "received at" time for the pipeline trigger.

As a result, while a path-filtering is still in progress and not completed yet - new workflows not dispatched yet, another commit (or more) can trigger a new pipeline in between, which creates a new generate-config workflow before the creation of the dispatched workflows of the previous pipeline. So, despite the fact that this workflow belongs to newer pipeline, it wins on comparison with workflows that belong to older pipelines/builds but have been created afterwards and is not being blocked.

It seems that this logic introduced long time ago, where orb switched to use the workflow created_at field for the comparison, instead of the committer_date field. As explained above, in cases where workflows are dispatched afterwards, this field does not really correlate with the time that a pipeline got triggered.

To help you understand better the problem, imagine the following scenario, which summaries the race condition mentioned above:

Our goal is to use the `circleci-queue` orb, and make sure that the `generate-config` workflow is properly 
queued/blocked until there are no workflows in progress, from previous pipelines. 
End goal is to have CircleCI/path-filtering orb dispatching any set of follow up workflows based on their 
commit order, and having only one commit with these workflows running at any time 
(a commit dispatches workflows based on path filtering only when workflows from the previous commit have been completed)

For simplicity reasons, let's imagine that there are no other pipelines/workflows in progress at that time, and events 
are ordered below based on their time.

* Commit `abcd` triggers a CircleCI pipeline with a `generate-config` workflow at `2023-03-21T00:00:00.000Z`. 
There are no other running workflows, so workflow is not being blocked from orb. 
At that time, both `oldest_commit_time` and `my_commit_time` point to `2023-03-21T00:00:00.000Z`.

* Commit `wxyz` triggers a CircleCI pipeline with a `generate-config` workflow at `2023-03-21T00:01:00.000Z`. 
At that time, `oldest_commit_time` points to `2023-03-21T00:00:00.000Z` and `my_commit_time` points to `2023-03-21T00:01:00.000Z`. 
This workflow will be blocked up until the above workflow, and any other workflow with older creation time is finished.

* The `generate-config` workflow, triggered by `abcd` commit, has completed and a new set of workflows are created at `2023-03-21T00:02:00.000Z`. 
At that time, `oldest_commit_time` points to `2023-03-21T00:01:00.000Z` and `my_commit_time` points to `2023-03-21T00:02:00.000Z`.

* The `generate-config` workflow, triggered by `wxyz` commit, is being unblocked, completes and dispatches another new set of workflows at `2023-03-21T00:04:00.000Z`. 
Normally, we would expect that the this workflow would be blocked up until the dispatched workflows from `abcd` commit are completed. 
However, this is not the case, since at that time, both `oldest_commit_time` and `my_commit_time` point to `2023-03-21T00:01:00.000Z`.

Expected behavior

The queue orb is able to block workflows, based on their commit order and does not rely on the workflow creation time, which can lead to race conditions as described above.

I am curious if we can fallback to another metadata field for the oldest_commit_time variable.

Also, currently, the name of this variable does not really reflect its real value, since it does not hold any information about the commit time, or the time around when a pipeline got triggered, and instead points to the workflow creation time.

I see that the motivation for this implementation was this issue: https://github.com/eddiewebb/circleci-queue/issues/22, where some issues reported for Orb not working properly on BitBucket when using the committer_date attribute, but maybe makes sense to re-investigate whether another metadata field exists that can be used and satisfy both cases.

Thoughts?

eddiewebb commented 1 year ago

This is now available in 2.2.1+

axilleastr commented 1 year ago

This is now available in 2.2.1+

Thank you @eddiewebb 🚀