isomerpages / isomercms-backend

A static website builder and host for the Singapore Government

feat: git push force to github on second retry #1313

Closed harishv7 closed 5 months ago

harishv7 commented 6 months ago

Problem

Several incidents of divergence between EFS and GitHub occurred recently. In practice, we trust EFS as our source of truth. Hence, it makes sense for us to recover from divergences automatically by doing a force push to GitHub.

There is still a case where Ops might edit on GitHub while site users are editing, in which case the force push can overwrite Ops' changes. A solution is to enable Ops to lock the repo -> perform the edit -> perform GGS repair using the form, with the affected repo unlocked automatically afterwards. For now, engineers can manually add/remove the lock on DDB.
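The lock -> edit -> repair -> unlock flow could be sketched as below. This is a minimal illustration only, not the actual implementation: the in-memory `lockedRepos` set stands in for the lock records on DDB, and `lockRepo`, `unlockRepo`, `isRepoLocked` and `editWithLock` are hypothetical names. The point it shows is that the unlock happens in a `finally` block, so the repo is released even if the edit or the repair fails.

```typescript
// In-memory stand-in for the lock records kept on DDB (assumption for illustration).
const lockedRepos = new Set<string>();

function isRepoLocked(repoName: string): boolean {
  return lockedRepos.has(repoName);
}

function lockRepo(repoName: string): void {
  if (lockedRepos.has(repoName)) {
    throw new Error(`Repo ${repoName} is already locked`);
  }
  lockedRepos.add(repoName);
}

function unlockRepo(repoName: string): void {
  lockedRepos.delete(repoName);
}

// Hypothetical Ops flow: lock -> edit on GitHub -> repair -> unlock.
// `editOnGitHub` and `repairSite` are caller-supplied callbacks, not real APIs.
async function editWithLock(
  repoName: string,
  editOnGitHub: () => Promise<void>,
  repairSite: () => Promise<void>
): Promise<void> {
  lockRepo(repoName);
  try {
    await editOnGitHub();
    await repairSite();
  } finally {
    // Always release the lock, even if the edit or the repair throws.
    unlockRepo(repoName);
  }
}
```

While the lock is held, CMS pushes for that repo would need to be blocked or deferred. How that check is wired into the push path is not covered by this sketch.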

Closes ISOM-947

Solution

If git push fails, we retry twice. On the second retry, we use git push --force to forcefully push the EFS commits to GitHub.
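The retry policy above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in this PR: `PushFn`, `pushArgsForAttempt` and `pushWithForceFallback` are hypothetical names, and the sketch reads "retry twice" as one initial push followed by two retries, with only the last retry adding `--force`.

```typescript
// Hypothetical type: runs one `git push` with the given extra CLI arguments
// and rejects if the push fails (e.g. a non-fast-forward rejection).
type PushFn = (extraArgs: string[]) => Promise<void>;

// Assumed reading of the description: one initial push, then up to two retries.
const MAX_RETRIES = 2;

// Attempt 0 is the initial push; attempts 1 and 2 are the retries.
// Only the final (second) retry adds --force.
function pushArgsForAttempt(attempt: number): string[] {
  return attempt >= MAX_RETRIES ? ["--force"] : [];
}

async function pushWithForceFallback(
  push: PushFn
): Promise<{ attempts: number; forced: boolean }> {
  let lastError: unknown = new Error("push was never attempted");
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    const extraArgs = pushArgsForAttempt(attempt);
    try {
      await push(extraArgs);
      return { attempts: attempt + 1, forced: extraArgs.length > 0 };
    } catch (err) {
      // Remember the failure and fall through to the next attempt.
      lastError = err;
    }
  }
  // Even the forced push failed: surface the last error to the caller.
  throw lastError;
}
```

A caller would pass a function that shells out to git from the EFS working copy. Because the forced attempt discards whatever is on the remote branch, it is only safe while EFS really is the source of truth for that repo.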

Breaking Changes

Screenshots of before and after

To simulate this on staging, I created a divergence by editing the repo on GitHub. (Screenshot 2024-05-02 at 2 08 38 PM)

Further editing on CMS caused no errors, and the divergence auto-recovered by taking EFS's state as the source of truth. (Screenshot 2024-05-02 at 2 09 11 PM)

The commit on GH was overridden by the CMS changes.

Tests

harishv7 commented 6 months ago

Hmm, what are your thoughts on just always force pushing since EFS is our source of truth? I think we just need to be very deliberate about ensuring that sites are repaired when users are migrated over to email login (which is already covered in our runbook!) - that way we won't run the risk of accidentally losing user commits which were previously made directly to GitHub.

@alexanderleegs what does "ensuring that sites are repaired when users are migrated over to email login" mean?

kishore03109 commented 6 months ago

@harishv7 I think the form solution should go out together with this though; making the lock manual and deviating from the original flow feels incident-prone. Could you add a note in the deployment section that the form ticket should be completed together with this PR?

alexanderleegs commented 6 months ago

Hmm, what are your thoughts on just always force pushing since EFS is our source of truth? I think we just need to be very deliberate about ensuring that sites are repaired when users are migrated over to email login (which is already covered in our runbook!) - that way we won't run the risk of accidentally losing user commits which were previously made directly to GitHub.

@alexanderleegs what does "ensuring that sites are repaired when users are migrated over to email login" mean?

Currently, cloning the site is done automatically when agencies are migrated from Netlify -> Amplify, but we don't always do the GitHub login -> email login step at the same time! This means user content can be lost in the following sequence: we migrate to Amplify > the user edits the site further > we migrate to email login but forget to run the site repair form > the user makes an edit on the new login, which causes a force push. Just something to be aware of; it's already in our runbook to run the site repair form after doing an email login migration.

linear[bot] commented 6 months ago

ISOM-947 Fix divergence between GitHub and EFS automatically

harishv7 commented 5 months ago

With https://github.com/isomerpages/isomercms-backend/pull/1327, we can now merge this PR in as discussed during incident meetings. Going forward, any divergence between EFS and GitHub will automatically be resolved by taking EFS as the source of truth: if the normal git push (without force) fails, we fall back to a git push --force.