isomerpages / isomercms-backend

A static website builder and host for the Singapore Government

IS-412: Move fetching SSM params to prebuild #943

Closed harishv7 closed 1 year ago

harishv7 commented 1 year ago

Problem

Right now we face an issue with fetching envs from SSM:

Closes IS-412

Solution

This solution introduces a few changes:

Why store on EFS?

When multiple instances boot up, the pre-deploy step executes on each of them. This is fine for a normal deployment. However, consider the edge case where we want to update envs on SSM. To trigger a rebuild, we need to re-deploy from GH Actions.

However, this takes a long time because of our rolling deploy. For urgent cases, we can instead expedite the update by running this script directly and then doing a "Restart App Instances" on EB.

EFS enables this because the script only needs to run once instead of once on each instance. However, if multiple instances boot up at the same time, they can overwrite the .env file concurrently and leave it in an inconsistent state. To prevent this, each instance writes to a local file and then transfers it to EFS. The script also has a simple locking mechanism that locks the folder being copied into, which prevents a clash between the two instances.
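The "write locally, then transfer" step can be sketched roughly as follows. This is a minimal illustration and not the script from this PR: the function name `publish_env` and its arguments are hypothetical, and the lock is assumed to be taken separately before the copy.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the env file on local disk first, then copy the
# finished file into the shared (EFS) directory in one step.
# Usage: publish_env <shared_dir> KEY=VALUE [KEY=VALUE ...]
publish_env() {
  local shared_dir="$1"
  shift
  local local_env
  # Stage the file on the instance's own disk, not on the shared mount.
  local_env="$(mktemp)" || return 1
  # Each remaining argument is written as one KEY=VALUE line.
  printf '%s\n' "$@" > "$local_env"
  # Transfer the complete file to the shared directory.
  cp "$local_env" "$shared_dir/.env"
  local status=$?
  rm -f "$local_env"
  return "$status"
}
```

Because the file is only copied once it is complete, readers of the shared directory never see a half-written set of params from a single instance; the lock then covers the remaining case of two instances copying at once.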

Note that if a param is present in the script but not on SSM, the current behaviour is to skip it and proceed to the next param. While exiting the script and failing the deploy might be good practice for failing early, this PR's aim is first to achieve parity with our current behaviour, where convict is the checking layer (though that check happens at runtime, after the deploy stage).
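The skip-and-continue behaviour might look something like the sketch below. The function names `fetch_param` and `write_env`, the prefix argument, and the output format are assumptions made for illustration; only the `aws ssm get-parameter` call is a real CLI command.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the AWS CLI: prints the param value, or returns
# non-zero if the param cannot be fetched (for example, it is not on SSM).
fetch_param() {
  aws ssm get-parameter --name "$1" --with-decryption \
    --query 'Parameter.Value' --output text 2>/dev/null
}

# Usage: write_env <ssm_prefix> <output_file> NAME [NAME ...]
# Writes NAME=value for every param that can be fetched and skips the rest.
write_env() {
  local prefix="$1" out="$2"
  shift 2
  local name value
  : > "$out"   # start from an empty file
  for name in "$@"; do
    if value="$(fetch_param "${prefix}${name}")"; then
      printf '%s=%s\n' "$name" "$value" >> "$out"
    else
      # Skip instead of failing the deploy; validation is left to runtime.
      echo "skipping ${name}: not found on SSM" >&2
    fi
  done
}
```

Switching to fail-early later would only require replacing the `else` branch with a `return 1`.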

How does locking work?

Here's what happens:

  1. The file descriptor 200 is associated with the file /efs/isomer/.isomer.lock.
  2. If /efs/isomer/.isomer.lock doesn't exist, it's created. If it already exists, it's just opened.
  3. flock then tries to acquire a lock on the file descriptor 200, which is linked to the .isomer.lock file.
  4. If flock can't acquire the lock (because some other process has it), then the script exits due to || exit 1.
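The four steps above correspond roughly to the following pattern. This is a sketch, not the exact script: the function name `acquire_lock` is hypothetical, and the `-n` (non-blocking) flag is an assumption inferred from the description of the script exiting when the lock is already held.

```shell
#!/usr/bin/env bash
# Hypothetical helper wrapping the flock pattern described above.
# Usage: acquire_lock [lock_file]
acquire_lock() {
  local lock_file="${1:-/efs/isomer/.isomer.lock}"
  # Steps 1-2: associate file descriptor 200 with the lock file, creating the
  # file if it does not exist and simply opening it if it does.
  exec 200>"$lock_file" || return 1
  # Steps 3-4: try to take an exclusive lock on descriptor 200. With -n, flock
  # returns non-zero immediately if another process already holds the lock.
  flock -n 200
}
```

A script would call `acquire_lock || exit 1` before copying into the shared folder. The lock is released when descriptor 200 is closed, which happens automatically when the script exits.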

Benefits

Cons

Breaking Changes

Tests

Tests run as part of the checks for this PR:

Deploy Notes

Ensure ALL envs are moved to SSM with appropriate prefix.

New dependencies: