Open sastels opened 1 year ago
PR tested locally. https://github.com/cds-snc/notification-api/pull/1789
Note that staging currently has 61K rows in notifications
and prod has 1.5M, so we should do some resigning timings.
If it takes an unacceptable amount of time to resign, we should either (or both)
ok, trying to resign 70K notifications crashed both api and celery pods. Looking at doing it in batches...
@sastels fixed the script to do it in batches and found that 50K was working in staging. To be extra safe, we can size the batch through a command line parameter when executing in production.
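For reference, a batch size taken from the command line could look roughly like the sketch below. This is not the script from the PR: the `batches` and `parse_args` helpers and the `--batch-size` flag are hypothetical names, and the 50K default only mirrors the value that worked in staging.

```python
import argparse
from typing import Iterator, List, Optional, Sequence


def batches(ids: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of ids, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the (hypothetical) --batch-size flag; defaults to 50K."""
    parser = argparse.ArgumentParser(description="Resign notifications in batches (sketch)")
    parser.add_argument("--batch-size", type=int, default=50_000,
                        help="number of notifications to resign per batch")
    return parser.parse_args(argv)


# Example run with a tiny batch size so the batching is visible.
args = parse_args(["--batch-size", "3"])
for batch in batches(list(range(7)), args.batch_size):
    # A real script would load, resign and commit these rows here.
    print(f"would resign {len(batch)} notifications: {batch}")
```

Committing after each batch keeps memory use bounded by the batch size rather than by the table size, which is the failure the 70K attempt ran into.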
We need to test the script for this task before we can move forward.
Description
As a Notify dev, I need to be able to test our SECRET_KEY rotations
WHY are we building?
We want to precisely document the steps for rotating SECRET_KEY and ensure that the process is fully tested
WHAT are we building?
Steps for rotating the SECRET_KEY and testing along the way
VALUE created by our solution
We can confidently rotate in production
Acceptance Criteria
Note:
Step 0: Test the system before doing anything
Step 1: Rotate SECRET_KEY
Assume that the current SECRET_KEY is K1 or K0,K1
Everything has been signed with K1 in the database and in transit
K1 is being used for signing
K0 (if it’s there) and K1 are used for verifying
[x] Create a new key K2 using RandomKeygen as discussed above.
Because both old containers and new containers will be running during the deployment process, we rotate the key in two steps.
[x] Merge a PR setting the SECRET_KEY to K2,K1
  Old containers (i.e. not updated yet with the new SECRET_KEY) will still sign with K1 and verify with K0,K1
  New containers (k8s / celery / api lambda) will sign with K1 and verify with K1,K2
  Both old and new containers will be able to verify each other's signatures
[x] POST an email and verify that you receive it, and can view it in admin
[x] verify that you can view old emails in admin
[x] verify that the callback was called
[x] Wait until the code is fully deployed (wait at least 10 minutes)
[x] POST an email and verify that you receive it, and can view it in admin
[x] verify that you can view old emails in admin
[x] verify that the callback was called
[x] Merge a PR setting the SECRET_KEY to K1,K2
  K2 will now be used for signing
  K1 and K2 will be used for verifying
  Old containers will still be signing with K1 and verifying with K1,K2
  New containers will start signing with K2 and verifying with K1,K2
  Both old and new containers will be able to verify each other's signatures
[x] POST an email and verify that you receive it, and can view it in admin
[x] verify that you can view old emails in admin
[x] verify that the callback was called
[x] Wait until the code is fully deployed (wait at least 10 minutes)
[x] POST an email and verify that you receive it, and can view it in admin
[x] verify that you can view old emails in admin
[x] verify that the callback was called
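The sign/verify behaviour that Step 1 relies on can be sketched as follows. This is a minimal stand-in built on the standard library, not the notification-api signing code. It assumes, as the steps above imply, that the last key in SECRET_KEY signs and every key in the list verifies. The `sign` and `verify` helpers and the key values are hypothetical.

```python
import hashlib
import hmac
from typing import Sequence


def _mac(value: str, key: str) -> str:
    """HMAC-SHA256 of value under key, as a hex string."""
    return hmac.new(key.encode(), value.encode(), hashlib.sha256).hexdigest()


def sign(value: str, keys: Sequence[str]) -> str:
    """Sign with the LAST key in the list (assumed convention)."""
    return f"{value}.{_mac(value, keys[-1])}"


def verify(signed: str, keys: Sequence[str]) -> str:
    """Return the value if ANY key in the list matches the signature."""
    value, _, digest = signed.rpartition(".")
    for key in keys:
        if hmac.compare_digest(_mac(value, key).encode(), digest.encode()):
            return value
    raise ValueError("signature does not match any configured key")


K0, K1, K2 = "key-zero", "key-one", "key-two"  # placeholder key values

# First PR (SECRET_KEY = K2,K1): old containers still run with K0,K1.
old, new = [K0, K1], [K2, K1]
assert verify(sign("payload", old), new) == "payload"
assert verify(sign("payload", new), old) == "payload"

# Second PR (SECRET_KEY = K1,K2): old containers now run with K2,K1.
old, new = [K2, K1], [K1, K2]
assert verify(sign("payload", old), new) == "payload"
assert verify(sign("payload", new), old) == "payload"

# Jumping straight from K0,K1 to K1,K2 would break old containers,
# because they cannot verify anything signed with K2.
try:
    verify(sign("payload", [K1, K2]), [K0, K1])
    skipped_step_works = True
except ValueError:
    skipped_step_works = False
```

The last check shows why the rotation takes two PRs: the first one only teaches every container to verify K2, and the second one starts signing with it.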
Step 2: Resign database fields
The database fields for old records have previously been signed with K1. New records are now being signed with K2
This will verify the fields using K1,K2
The fields will be resigned with K2
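Resigning a single field amounts to verifying it with any current key and signing it again with the newest one. The sketch below illustrates that under the same assumption as Step 1 (the last key in the list signs). It uses a standard-library stand-in for the real signer, and `resign` and the other helper names are hypothetical.

```python
import hashlib
import hmac
from typing import Sequence


def _mac(value: str, key: str) -> str:
    return hmac.new(key.encode(), value.encode(), hashlib.sha256).hexdigest()


def sign(value: str, keys: Sequence[str]) -> str:
    # Assumed convention: the last key in the list is the signing key.
    return f"{value}.{_mac(value, keys[-1])}"


def verify(signed: str, keys: Sequence[str]) -> str:
    value, _, digest = signed.rpartition(".")
    if any(hmac.compare_digest(_mac(value, k).encode(), digest.encode()) for k in keys):
        return value
    raise ValueError("signature does not match any configured key")


def resign(signed: str, keys: Sequence[str]) -> str:
    """Verify with any key in the list, then sign again with the last key."""
    return sign(verify(signed, keys), keys)


K1, K2 = "key-one", "key-two"  # placeholder key values

old_field = sign("user@example.com", [K1])   # record written before the rotation
new_field = resign(old_field, [K1, K2])      # SECRET_KEY is now K1,K2

# After resigning, K2 alone is enough to verify the field.
assert verify(new_field, [K2]) == "user@example.com"
# Resigning an already-resigned field leaves it unchanged, so reruns are safe.
assert resign(new_field, [K1, K2]) == new_field
```

Because resigning is idempotent in this sketch, a batch that is interrupted can simply be run again.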
Bonus Step Locally:
QA Steps