awslabs / landing-zone-accelerator-on-aws

Deploy a multi-account cloud foundation to support highly-regulated workloads and complex compliance requirements.
https://aws.amazon.com/solutions/implementations/landing-zone-accelerator-on-aws/
Apache License 2.0

Issue with cloudwatchLogRetentionInDays parameter #516

Open senyberg opened 2 months ago

senyberg commented 2 months ago

Describe the bug

Documentation for parameter cloudwatchLogRetentionInDays says:

This retention setting will be applied to all CloudWatch log groups created by the accelerator. Additionally, this retention setting will be applied to any CloudWatch log groups that already exist in the target environment if the log group's retention setting is LOWER than this configured value.

But it seems like it does not update all log groups that exist, only current ones. For example, our setup has two log groups for Identity Center (not sure why):

/aws/lambda/AWSAccelerator-IdentityCe-IdentityCenterInstanceId-xyz
/aws/lambda/AWSAccelerator-IdentityCe-IdentityCenterInstanceId-abc

Only the latter has been updated from 10 years to 12 months (which is the new value).

To Reproduce

Update the cloudwatchLogRetentionInDays parameter to a lower value than the current one and observe that not all log groups are updated.

Expected behavior

According to the documentation, all log groups (old and current) should be updated with the new value.


rycerrat commented 2 weeks ago

Hi @senyberg,

The root cause of the difference between the log groups seems to be that one log group's retention is set explicitly by the Landing Zone Accelerator, while the other's is set by the underlying CDK. What I mean by this is that the LZA utilizes the Custom Resource Provider framework (https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.CustomResourceProvider.html). Using this framework creates two Lambda functions: one is a provider Lambda function, and the second is the underlying Lambda function that actually implements the custom resource.

From what I see, the Provider Lambda functions don't seem to have their retention periods adjusted, while the actual Custom Lambda functions do. I believe this is why in the logs there are two Lambda functions with similar names with seemingly different behavior.

Unfortunately, this means that you will likely need to update log-group retentions manually in order to rectify this issue.
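For anyone who needs to do the manual update, here is a minimal sketch in Python. It assumes boto3 is used to create the CloudWatch Logs client and that the affected log groups share a common name prefix such as /aws/lambda/AWSAccelerator- (taken from the names above). The function name `set_retention_for_prefix` and the `dry_run` flag are hypothetical and not part of the LZA. It only relies on the CloudWatch Logs `describe_log_groups` paginator and `put_retention_policy`.

```python
from typing import Any, List


def set_retention_for_prefix(
    logs_client: Any,
    prefix: str,
    retention_days: int,
    dry_run: bool = True,
) -> List[str]:
    """Return the log groups under `prefix` whose retention differs from
    `retention_days`, updating them unless `dry_run` is True.

    `logs_client` is assumed to be a CloudWatch Logs client, for example
    boto3.client("logs"). This helper is an illustration, not LZA code.
    """
    changed: List[str] = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate(logGroupNamePrefix=prefix):
        for group in page.get("logGroups", []):
            # "retentionInDays" is absent when a group is set to "Never expire".
            if group.get("retentionInDays") == retention_days:
                continue
            if not dry_run:
                logs_client.put_retention_policy(
                    logGroupName=group["logGroupName"],
                    retentionInDays=retention_days,
                )
            changed.append(group["logGroupName"])
    return changed
```

Running it first with the default `dry_run=True` lists what would change; for example, `set_retention_for_prefix(boto3.client("logs"), "/aws/lambda/AWSAccelerator-", 365)` in each affected account and region. CloudWatch Logs accepts only specific retention values (365 for 12 months, 3653 for 10 years), so the value should match one of those.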

From the LZA side, I am more than happy to submit an issue internally to either make the documentation of this feature more clear, or to try and see if we can handle the underlying Lambda retentions as well.

Thanks for your patience.

senyberg commented 1 week ago

Hello @rycerrat ,

Thanks for the explanation. I don't exactly understand what the "Provider Lambda function" is, but this is probably not that important. I understand the reason for this.

I can confirm that for a new account provisioned with a lower cloudwatchLogRetentionInDays, it seems to be set correctly (almost). There seem to be two log groups that are set to "Never expire":

These seem to be consistent for new accounts.
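A quick way to spot such "Never expire" log groups in an account is sketched below in Python. It assumes a boto3 CloudWatch Logs client is passed in; the function name `find_never_expiring` is hypothetical. It relies on `describe_log_groups` omitting the `retentionInDays` field for log groups that have no retention policy.

```python
from typing import Any, List, Optional


def find_never_expiring(logs_client: Any, prefix: Optional[str] = None) -> List[str]:
    """Return the names of log groups that have no retention policy.

    `logs_client` is assumed to be a CloudWatch Logs client such as
    boto3.client("logs"); `prefix` optionally narrows the search.
    """
    kwargs = {"logGroupNamePrefix": prefix} if prefix else {}
    names: List[str] = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate(**kwargs):
        for group in page.get("logGroups", []):
            # No "retentionInDays" key means the group never expires.
            if "retentionInDays" not in group:
                names.append(group["logGroupName"])
    return names
```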