Open jimleroyer opened 1 year ago
@sastels to QA
We do still have a few staging app-related logs that have retention less than a year. Possibly we don't care about them?
Ben to review Steve's findings and provide Final Judgement on these. (poor souls)
The 3 lambda ones are because they were part of the terraform module. I've updated the terraform modules repo to allow customizing this setting: https://github.com/cds-snc/terraform-modules/pull/345
The batch saving log group is not in terraform. I've added it in and set it appropriately. The PR will require manual imports before merging.
New PR created that splits sensitive and non-sensitive log retention periods. Will get the team to review today.
Need to verify what happened yesterday
Need to verify what happened two days ago
Not sure what I needed to verify before... Lesson learned to be more descriptive.
Can confirm that this has been released and is ready for QA.
Steve will take a look!
/aws/lambda/ses_to_sqs_email_callbacks and /aws/lambda/sns_to_sqs_sms_callbacks are both one week as expected for PII
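A minimal sketch of the sensitive vs. non-sensitive split being checked here, assuming one week for the PII callback log groups and one year for everything else, as discussed in this thread. The constant names and both helper functions are hypothetical and are not taken from the terraform repo.

```python
# Retention targets discussed in this issue (assumed values: 7 days for PII, 365 days otherwise).
SENSITIVE_RETENTION_DAYS = 7
DEFAULT_RETENTION_DAYS = 365

# Log groups named in this thread as holding PII.
SENSITIVE_LOG_GROUPS = {
    "/aws/lambda/ses_to_sqs_email_callbacks",
    "/aws/lambda/sns_to_sqs_sms_callbacks",
}


def expected_retention_days(log_group_name: str) -> int:
    """Return the retention (in days) we expect for a given log group."""
    if log_group_name in SENSITIVE_LOG_GROUPS:
        return SENSITIVE_RETENTION_DAYS
    return DEFAULT_RETENTION_DAYS


def retention_matches(log_group_name: str, actual_days: int) -> bool:
    """Check a log group's actual retention against the expected value."""
    return actual_days == expected_retention_days(log_group_name)


print(expected_retention_days("/aws/lambda/ses_to_sqs_email_callbacks"))
print(retention_matches("/aws/lambda/sns_to_sqs_sms_callbacks", 7))
```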
/aws/lambda/ses-receiving-emails doesn't exist in prod and is empty in staging so maybe it's just some old thing that's not used anymore? :/
For the BatchSaving log group I see retention "Never expire" in staging / prod - is this expected?
Probably just needs a terraform variable set - Ben to investigate!
Ben to actually investigate today.
Ran into operations issues, will try and get to this today.
I verified that the callbacks are as expected.
The ses-receiving-emails was empty in staging and didn't exist anywhere else including TF. I deleted the log group in staging.
BatchSaving - I started looking but then had to switch to production release troubleshooting. Will get to this today.
@sastels to QA
@sastels to QA today!
today. I swear. 95%.
The BatchSaving log group is still "never expire", but it just has metrics in it and only goes back to 2023-12-04, so maybe it's expiring itself and is fine as it is?
In prod the retention period is configured to 0, which means indefinite, so that's expected... unless this has sensitive info in it? In that case I can change it to the 7-day retention.
The retention period is set correctly in staging, at 12 months. As to why there are only entries from December, that I'm not sure of.
makes sense!
Description
As an ops lead or GCNotify developer, I would like to increase the log data retention to 1 year, so that I can properly revisit past issues, assess them, and fix them.
At the moment, the retention period for some log groups is set to 1 month, which is too short for our needs. For example: /aws/containerinsights/notification-canada-ca-staging-eks-cluster/application is set to 1 month.
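One way to find the affected log groups is sketched below, assuming the input has the shape returned by CloudWatch Logs `describe_log_groups`, where `retentionInDays` is absent for groups set to "Never expire". The function name `audit_retention` and the sample data are illustrative only; the sample values mirror what is mentioned in this issue and are not a dump of the real account.

```python
from typing import Dict, Iterable, List, Tuple

ONE_YEAR_DAYS = 365


def audit_retention(
    log_groups: Iterable[Dict], minimum_days: int = ONE_YEAR_DAYS
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Split log groups into those below the minimum and those that never expire."""
    too_short = []
    never_expire = []
    for group in log_groups:
        name = group["logGroupName"]
        days = group.get("retentionInDays")  # key is absent when retention is "Never expire"
        if days is None:
            never_expire.append(name)
        elif days < minimum_days:
            too_short.append((name, days))
    return too_short, never_expire


# Illustrative sample only; "example-app-logs" is a made-up name.
sample = [
    {
        "logGroupName": "/aws/containerinsights/notification-canada-ca-staging-eks-cluster/application",
        "retentionInDays": 30,
    },
    {"logGroupName": "/aws/lambda/ses_to_sqs_email_callbacks", "retentionInDays": 7},
    {"logGroupName": "BatchSaving"},
    {"logGroupName": "example-app-logs", "retentionInDays": 365},
]

too_short, never_expire = audit_retention(sample)
print(too_short)
print(never_expire)
```

Against a real account, the same function could be fed the `logGroups` entries from each page of the boto3 `describe_log_groups` paginator. Groups that hold PII will show up as "too short" on purpose, so they need to be excluded or reviewed by hand.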
WHY are we building?
To debug issues that are older than a month.
WHAT are we building?
Increased log retention in the STAGING environment.
VALUE created by our solution
More debugging capacity.
Acceptance Criteria
QA Steps