How is the locking mechanism implemented?

geekodour commented 7 months ago


Hi, thanks for creating this fantastic action! I was wondering how the locking mechanism is implemented under the hood. Does the action keep running until the lock is released? (Thinking from a billing perspective)

GrantBirki commented 7 months ago

👋 Hey @geekodour! Thanks for opening this issue, and it's an excellent question. I'll go into detail below to explain how the locking mechanism works under the hood.

Thinking from billing perspective

First off, I have good news. The Action does not keep running until the lock gets released. If it did, using this Action would incur massive costs, and that is something no one wants. In fact, this Action only takes about 4 seconds on average to run, so it is highly cost efficient from that perspective.

How does the locking mechanism work under the hood?

When designing the locking mechanism for this Action, I took an unusual approach. When writing this Action (initially it was an experiment), I was really stumped about how to implement a lock. In fact, I thought that it probably couldn't be done. Then I had a wacky idea... What if I stored the lock information in a branch in the GitHub repo where the github/branch-deploy Action was being run (i.e. username/your-repo)?

Okay but wait a second... let me back up and explain why I'm even trying to "store" that lock data somewhere. So here at GitHub we use hubot and chatops to facilitate deployments. Rather than doing .deploy on a pull request (like what this Action does), we might do something like .deploy in a Slack chatroom to trigger a deployment of a pull request containing new features. When we start a deployment, a "lock" is created so that another engineer doesn't accidentally do .deploy a few moments later and roll back our changes or create potential issues. This lock is stateful and so it must be stored somewhere. Examples of where this lock data could be stored might be: redis, a nosql database, a mysql database, or even an S3 bucket. That lock might look something like this:

[
  {
    "repo": "github/cool-project",
    "locked": true
  }
]

So when another engineer goes to run .deploy, we will first iterate over all the locks that exist, and check to see if the project that this engineer is trying to deploy is locked or not. If it is, we let them know to try again later. If not, their deployment is started.
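The check described above could be sketched roughly like this in Python. This is only an illustration of the idea, not the actual chatops code: the `is_locked` and `try_deploy` helpers and the second lock record are hypothetical, and the record shape simply mirrors the JSON example above.

```python
from typing import Dict, List

# Hypothetical lock records, shaped like the JSON example above.
LOCKS: List[Dict] = [
    {"repo": "github/cool-project", "locked": True},
    {"repo": "github/other-project", "locked": False},
]


def is_locked(repo: str, locks: List[Dict]) -> bool:
    """Return True if any lock record marks this repo as locked."""
    return any(lock["repo"] == repo and lock["locked"] for lock in locks)


def try_deploy(repo: str, locks: List[Dict]) -> str:
    """Start a deployment only when the repo is not locked."""
    if is_locked(repo, locks):
        return f"{repo} is locked, try again later"
    return f"deployment of {repo} started"


if __name__ == "__main__":
    print(try_deploy("github/cool-project", LOCKS))
    print(try_deploy("github/other-project", LOCKS))
```

The important part is that `LOCKS` has to live somewhere that survives between runs, which is exactly the problem discussed next.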

But... how the heck do we do this in GitHub Actions? All runners are ephemeral and anything you might want to do during the execution of an Action is discarded when the workflow finishes. Simply put, there is no state in Actions (generally speaking). Additionally, this Action should be really simple to use and should not assume that the organization installing it has a database set up that the Action could interact with. That would add a lot of complexity and I certainly wouldn't bother if I had stumbled across this project.

And so the idea of storing lock data directly in the repo where the .deploy command was being invoked from popped into my head. After doing some research, I found out that pushing a branch to GitHub was just about the closest thing to an atomic operation you could get. So it should be safe to assume that if two users type .deploy on two separate pull requests at the exact same time, only one should ever get the lock. I tested this extensively and never ran into a single issue. Then we deployed this Action internally at GitHub with a few projects and also never ran into an issue. Then this project was adopted by hundreds of people in the open source community, governments, and even massive package managers... all without a single issue with the lock methodology. So it works and has been battle tested, nice!
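To make the "only one should ever get the lock" property concrete, here is a small Python simulation. It does not talk to GitHub at all: `RefStore` is a hypothetical in-memory stand-in for a repo's branch namespace, and its internal mutex models the assumption that creating a branch either fully succeeds or fails because the branch already exists.

```python
import threading
from typing import Dict, List


class RefStore:
    """Hypothetical in-memory stand-in for a repo's branch namespace."""

    def __init__(self) -> None:
        self._refs: Dict[str, str] = {}
        # Models the assumption that branch creation is atomic on the server.
        self._mutex = threading.Lock()

    def create_branch(self, name: str, owner: str) -> bool:
        """Create the branch only if it does not exist yet."""
        with self._mutex:
            if name in self._refs:
                return False  # someone else already holds the lock
            self._refs[name] = owner
            return True

    def owner_of(self, name: str) -> str:
        return self._refs[name]


def race(store: RefStore, branch: str, users: List[str]) -> Dict[str, bool]:
    """Have every user try to create the same lock branch at the same time."""
    results: Dict[str, bool] = {}
    barrier = threading.Barrier(len(users))

    def attempt(user: str) -> None:
        barrier.wait()  # release all threads together
        results[user] = store.create_branch(branch, user)

    threads = [threading.Thread(target=attempt, args=(u,)) for u in users]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    outcome = race(RefStore(), "production-branch-deploy-lock", ["alice", "bob"])
    print(outcome)  # exactly one user gets True; which one varies per run
```

Whoever wins the race owns the lock, and everyone else sees the branch already exists and backs off.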

Breaking it Down

Here it is, broken down under the hood:

A user comments .deploy on a pull request and the branch-deploy workflow kicks off
A new "lock" branch is created that contains metadata about the lock

Navigating to the lock branch, you can see that a file called lock.json now exists at the root of the repo (ex: https://github.com/<org>/<repo>/blob/production-branch-deploy-lock/lock.json). The contents of that lock.json file might look something like this:

{
  "reason": null,
  "branch": "GrantBirki-patch-1",
  "created_at": "2024-03-18T02:27:58.156Z",
  "created_by": "GrantBirki",
  "sticky": true,
  "environment": "production",
  "global": false,
  "unlock_command": ".unlock production",
  "link": "https://github.com/<org>/<repo>/pull/57#issuecomment-2002763455"
}

Subsequent deployments will now check if this lock is present for the given environment. If the lock exists, and the user issuing the .deploy command is not the owner of the lock, their deployment will be rejected.
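As a rough sketch of that last check, the Python below decides what to do given the contents of lock.json. It is not the Action's real code: the function names and return values are hypothetical, and building the branch name as `<environment>-branch-deploy-lock` is an assumption inferred from the example URL above.

```python
import json
from typing import Optional


def lock_branch_name(environment: str) -> str:
    """Assumed naming pattern, inferred from the example URL above."""
    return f"{environment}-branch-deploy-lock"


def check_lock(lock_json: Optional[str], actor: str) -> str:
    """Decide the outcome of a .deploy given the lock file contents.

    lock_json is None when the lock branch does not exist.
    Returns "no-lock", "owner", or "rejected" (hypothetical labels).
    """
    if lock_json is None:
        return "no-lock"  # nothing is locked, so the deployment can claim it
    lock = json.loads(lock_json)
    if lock["created_by"] == actor:
        return "owner"  # the lock owner may keep deploying
    return "rejected"  # someone else holds the lock


EXAMPLE_LOCK = json.dumps(
    {
        "reason": None,
        "branch": "GrantBirki-patch-1",
        "created_by": "GrantBirki",
        "sticky": True,
        "environment": "production",
        "global": False,
        "unlock_command": ".unlock production",
    }
)

if __name__ == "__main__":
    print(lock_branch_name("production"))
    print(check_lock(EXAMPLE_LOCK, "GrantBirki"))
    print(check_lock(EXAMPLE_LOCK, "geekodour"))
    print(check_lock(None, "geekodour"))
```

A rejected user can be pointed at the `link` and `unlock_command` fields from the lock file so they know who holds the lock and how it gets released.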

Hopefully that is enough primer information on how the deployment locks work in this project! If you need additional information, I will provide some links below.

Extra Details

Lock Documentation
Source code:
github/lock - Standalone "lock" Action using this logic discussed here

geekodour commented 7 months ago

Thanks for the wonderful explanation @GrantBirki ! Appreciate it!
