awslabs / landing-zone-accelerator-on-aws

Deploy a multi-account cloud foundation to support highly-regulated workloads and complex compliance requirements.
https://aws.amazon.com/solutions/implementations/landing-zone-accelerator-on-aws/
Apache License 2.0

Incorrect metadata on LogArchive Cloudwatch logs #412

Open edhull opened 7 months ago

edhull commented 7 months ago

Describe the bug

CloudWatch logs are compressed and consolidated under the aws-accelerator-central-logs-<account>-<region>/CloudWatchLogs S3 bucket/path in the LogArchive account by the Landing Zone Accelerator. However, these objects are not assigned a gzip metadata/MIME type when stored. Other logs in the same bucket (such as CloudTrail logs and load balancer logs) are correctly assigned this metadata.

The impact of this bug is that tools which ingest these logs (e.g. https://github.com/aws-samples/siem-on-amazon-opensearch-service) cannot ingest the CloudWatch logs correctly, as there is no metadata or file extension to indicate that they should first be decompressed.
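To illustrate the ingestion problem, here is a minimal sketch of how a consumer could cope without metadata or a file extension: it checks the first two bytes of the object body for the gzip magic number and decompresses only when they match. The function name `maybe_decompress` and the sample payload are hypothetical and are not taken from LZA or the SIEM solution.

```python
import gzip
import json

# The first two bytes of any gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def maybe_decompress(body: bytes) -> bytes:
    """Return the decompressed payload if body looks like gzip, else body unchanged.

    Hypothetical helper: it sniffs the content instead of trusting
    Content-Encoding or a .gz suffix, since neither is present here.
    """
    if body[:2] == GZIP_MAGIC:
        return gzip.decompress(body)
    return body


if __name__ == "__main__":
    # Made-up payload standing in for a CloudWatch Logs record.
    sample = json.dumps({"logEvents": [{"message": "hello"}]}).encode()
    stored = gzip.compress(sample)  # what ends up in the bucket
    print(maybe_decompress(stored) == sample)   # compressed input is unpacked
    print(maybe_decompress(sample) == sample)   # plain input passes through
```

Content sniffing like this is only a consumer-side workaround; it does not replace storing the objects with the correct metadata.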

To Reproduce

Expected behavior

Compressed CloudWatch logs which are consolidated from Organization accounts should be stored with the correct metadata/MIME type. This should include a Content-Encoding of gzip and a Content-Type of application/json.
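For reference, the expected metadata can be expressed as the arguments to an S3 in-place copy. This is only a sketch of a possible after-the-fact fix using boto3's `copy_object`, not something LZA does; the helper names, bucket, and key below are hypothetical. Note that `MetadataDirective="REPLACE"` discards any existing user-defined metadata on the object.

```python
def build_metadata_fix_args(bucket: str, key: str) -> dict:
    """Build copy_object arguments that rewrite an object's metadata in place.

    Hypothetical helper. The Content-Type and Content-Encoding values
    are the ones requested in this issue.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},  # copy onto itself
        "MetadataDirective": "REPLACE",  # required to change metadata on copy
        "ContentType": "application/json",
        "ContentEncoding": "gzip",
    }


def apply_metadata_fix(bucket: str, key: str) -> None:
    """Apply the fix with boto3 (needs AWS credentials; not called here)."""
    import boto3  # imported lazily so the sketch loads without boto3

    boto3.client("s3").copy_object(**build_metadata_fix_args(bucket, key))


if __name__ == "__main__":
    # Placeholder bucket and key, for illustration only.
    args = build_metadata_fix_args("example-central-logs", "CloudWatchLogs/example-object")
    print(args["ContentEncoding"], args["ContentType"])
```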

Please complete the following information about the solution:

nagmesh commented 1 week ago

Hello,

Thank you for your interest in LZA.

By default, Firehose puts objects in S3 with a Content-Type of application/octet-stream.

Currently, within the S3 destination settings we have to leave compression set to uncompressed, because CloudWatch already sends the files with gzip level 8 compression. The Firehose service released a new feature to add a file extension, which we could try using to add a .gz extension. However, we will need to run a full test to confirm the behavior is the same in all regions and partitions before it can be released.

I'll leave this ticket to track the Firehose file extension ask. The other alternatives are to