meshcloud / gate-resource

A generic quality-gate resource for Concourse CI
Apache License 2.0

Add possibility to "reset" autoclose gates #2

Open iverberk opened 5 years ago

iverberk commented 5 years ago

I'm trying to implement a use-case where I put an autoclose gate and wait for all the dependent items to pass. For the next run of my deployment pipeline, I want to re-enable the same autoclose gate and have the exact same logic. Currently it will just ignore the new autogate because the .autoclose extension has been removed and the file is already there. Instead of generating a new file every time I'd like to reuse the same autoclose gate.

JohannesRudolph commented 5 years ago

Hi, great to hear it’s of some use to you.

The way we model it is that each deploy gets a unique gate ID, typically derived from the git rev. That also makes things a little easier to trace in Concourse should things go wrong.

What might work is to delete the gate file from the gate repo before scheduling a new deploy. This can be done out of band (i.e. without involving the gate resource).
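A minimal sketch of such an out-of-band deletion, assuming the gate repo is an ordinary git checkout and that a gate item lives at `<gate>/<item>` inside it. The layout, the helper name `reset_gate_commands`, and the example paths are assumptions for illustration, not part of gate-resource. The helper only builds the git commands so they can be reviewed before anything is run:

```python
from pathlib import PurePosixPath


def reset_gate_commands(repo_dir, gate, item, branch="master"):
    """Return the git commands that would delete one gate item and push the change.

    Hypothetical helper: the <gate>/<item> layout is an assumption.
    """
    path = str(PurePosixPath(gate) / item)
    return [
        # Sync first, since other pipelines may be writing to the same repo.
        ["git", "-C", repo_dir, "pull", "--rebase", "origin", branch],
        # --ignore-unmatch keeps `git rm` from failing if the file is already gone.
        ["git", "-C", repo_dir, "rm", "--ignore-unmatch", "--", path],
        ["git", "-C", repo_dir, "commit", "-m", f"reset gate {path}"],
        ["git", "-C", repo_dir, "push", "origin", branch],
    ]


for cmd in reset_gate_commands("/tmp/gates", "non-prod", "deploy"):
    print(" ".join(cmd))
```

Each command could be passed to `subprocess.run(cmd, check=True)`. Note that the commit step fails if there was nothing to delete, so a real script would probably want to handle that case.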

Does any of that sound plausible to you? Happy to learn more about your use case.

Ivo Verberk notifications@github.com wrote on Sun, 12 May 2019 at 12:49:

I'm trying to implement a use-case where I put an autoclose gate and wait for all the dependent items to pass. For the next run of my deployment pipeline, I want to re-enable the same autoclose gate and have the exact same logic. Currently it will just ignore the new autogate because the .autoclose extension has been removed and the file is already there. Instead of generating a new file every time I'd like to reuse the same autoclose gate.


iverberk commented 5 years ago

Hi Johannes,

I was just thinking about this a little more. Maybe it makes more sense to create a new autoclose gate for every version that we deploy. I was just wondering about the number of files that eventually end up in the git repo.

I agree with you that it might make more sense to have a proper trail than to keep overwriting the same thing (although it is in the git history).

Our use case is this:

  1. We have a number of target accounts that we deploy a Kubernetes platform to. This is a deployment pipeline that consists of several jobs.
  2. The accounts are split into non-prod and prod accounts, effectively creating stages of roll-out.
  3. We want a mechanism that starts with the non-prod pipelines and waits for all of them to finish successfully before continuing with the prod pipelines.
  4. The idea is to create an autoclose gate for each stage and have it wait for all the dependent pipelines to pass.
  5. Every pipeline would write its own gate and eventually, after all the non-prod pipelines have finished, the autoclose gate would trigger the next stage.
  6. We were thinking of just resetting the gate for the next deployment because the logic is the same each time and it seems wasteful to create a new autoclose file for each deploy.
  7. However, resetting (removing the autoclose file) and putting the same dependent objects back in there would just immediately close the gate. We actually want the new gate to wait for new versions of all the pipeline gates. But this also means writing new gates for each pipeline. We can incorporate the semver into the gate name to enforce this.

So I guess we are better off just creating new versions of each pipeline gate and writing a new autoclose gate that matches all the specific pipeline gates for that version. Do you agree or is there a better way to make this process work?
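The difference between reusing gate names and versioning them can be modelled in a few lines. This is only a sketch under stated assumptions: an autoclose gate is a file named `<name>.autoclose` that lists its dependent gates one per line and loses the extension once all of them exist. The extension behaviour is described earlier in this thread; the one-per-line format, the flat directory layout, and the names `try_autoclose` and `versioned` are assumptions made for illustration, not gate-resource's actual implementation.

```python
import tempfile
from pathlib import Path


def try_autoclose(gate_dir, name):
    """Close `<name>.autoclose` if every dependent gate file it lists exists.

    Returns True when the gate is closed (already, or by this call).
    """
    gate_dir = Path(gate_dir)
    closed = gate_dir / name
    pending = gate_dir / f"{name}.autoclose"
    if closed.exists():
        return True  # already closed: putting the same name again changes nothing
    if not pending.exists():
        return False
    deps = [line.strip() for line in pending.read_text().splitlines() if line.strip()]
    if all((gate_dir / dep).exists() for dep in deps):
        pending.rename(closed)  # dropping the extension marks the gate as passed
        return True
    return False


def versioned(name, version):
    """Hypothetical naming scheme: embed the semver in the gate name."""
    return f"{name}-{version}"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    pipelines = ["account-a", "account-b"]

    # Fixed names: the pipeline gates from the previous deploy still exist,
    # so a freshly "reset" autoclose gate closes at once.
    for p in pipelines:
        (root / p).write_text("passed\n")
    (root / "non-prod.autoclose").write_text("\n".join(pipelines) + "\n")
    reused_closes_immediately = try_autoclose(root, "non-prod")

    # Versioned names: the new autoclose gate waits for this version's gates.
    deps = [versioned(p, "1.1.0") for p in pipelines]
    stage = versioned("non-prod", "1.1.0")
    (root / f"{stage}.autoclose").write_text("\n".join(deps) + "\n")
    waits = not try_autoclose(root, stage)
    for d in deps:
        (root / d).write_text("passed\n")
    closes_after_all_pass = try_autoclose(root, stage)
```

In this model the reused gate closes immediately, while the versioned gate stays open until every pipeline has written its gate for that version, which matches the reasoning in point 7.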

Thanks!

JohannesRudolph commented 5 years ago

I was just wondering about the amount of files that eventually end up in the git repo.

Yes, that's definitely an issue, as is contention on the repo. At the end of the day we do abuse git as a distributed k/v store, which is not exactly what it was built to be :-) Our gate repo has accumulated a few 100Ks of commits, and things certainly go slower than they should. However, gates are only a tiny portion of our total build time, so it's not such a big deal. One small thing we do, though, is run a script that regularly deletes "orphaned" autogates (e.g. feature branch builds that failed and won't be needed anymore).
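The cleanup script itself is not shown in this thread. As a rough sketch of what such a script might look like, the following treats any still-pending `*.autoclose` file older than a given number of days as orphaned. The age-based rule, the function name `find_orphaned_autogates`, and the use of file modification times are all assumptions for illustration:

```python
import os
import tempfile
import time
from pathlib import Path


def find_orphaned_autogates(repo_dir, max_age_days, now=None):
    """Return *.autoclose files whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in Path(repo_dir).rglob("*.autoclose")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "feature-x.autoclose"
    new = Path(tmp) / "release.autoclose"
    old.write_text("dep\n")
    new.write_text("dep\n")
    # Backdate one file so it looks like a long-abandoned feature-branch gate.
    ten_days_ago = time.time() - 10 * 86400
    os.utime(old, (ten_days_ago, ten_days_ago))
    orphaned = [p.name for p in find_orphaned_autogates(tmp, max_age_days=7)]
```

One caveat: in a fresh clone, file modification times reflect the checkout rather than the commit, so a real script would more likely read the last commit time per file (for example via `git log -1 --format=%ct -- <path>`) and then remove the matches with `git rm`.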

We built gate-resource as a workaround to our immediate scaling problems around a mono-repo build & deploy pipeline, so we cut some corners intentionally. Maybe concourse will provide better support for a scenario like ours in the future.

Your use case and intention sound very legit. Unless you deploy hundreds of versions each day you should not have any immediate scaling issues around gate resources. If all you want to have is a distributed lock you can take a look at https://github.com/concourse/pool-resource, which heavily inspired gate-resource but turned out to be too simplistic for what we needed.

iverberk commented 5 years ago

I think the pool resource is too simplistic for our use-case too, unless you see an option for our use-case? We know up-front how many pipelines we need to wait for, but how do we check that all have completed? I don't see a way to do that with the pool resource. Your resource seems more suited to this. We do use the pool resource to batch our deployments and lock the entire process until all stages have been successfully deployed.

I think we need to prune our git repo from time to time. We are deploying at most a couple of times a day I think so it won't be too much of a problem but I'd like to be ahead of the curve.

JohannesRudolph commented 4 years ago

Hi, just want to add that I’m working on a 2.0 of gate resource that fixes these concerns.

The first beta uses shallow clones to speed up gate operations significantly and to maintain O(1) performance with respect to the number of commits. One other planned addition is a cleanup script that removes expired gates, see #1
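For context, a shallow clone fetches only the most recent commits instead of the full history, which is why its cost does not grow with the number of commits in the gate repo. A minimal sketch of building such a command follows; the helper name, the example URL, and the default branch are hypothetical, and this is not necessarily how gate-resource 2.0 invokes git:

```python
def shallow_clone_command(repo_url, dest, branch="master", depth=1):
    """Build a `git clone` command that fetches only the last `depth` commits."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return [
        "git", "clone",
        "--depth", str(depth),   # truncate history to the newest commits
        "--single-branch",       # fetch only the requested branch
        "--branch", branch,
        repo_url, dest,
    ]


cmd = shallow_clone_command("https://example.invalid/gates.git", "/tmp/gates")
print(" ".join(cmd))
```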