ministryofjustice / modernisation-platform

A place for the core work of the Modernisation Platform • This repository is defined and managed in Terraform
https://user-guide.modernisation-platform.service.justice.gov.uk

Protective Monitoring: Replace Firehose logs with S3 #7607

Open SimonPPledger opened 1 month ago

SimonPPledger commented 1 month ago

User Story

We currently provide several sets of logs to the XSIAM protective monitoring tool. However, they are having issues with the information arriving, via Firehose, from the following sources:

- VPC Flow Logs
- Network Firewall logs
- Route 53 Resolver logs

This ticket is to provide the same log information in new S3 buckets, and then to liaise with the security development team to confirm that they can ingest these logs correctly.

Value / Purpose

This will improve the security of the Modernisation Platform and enable the SOC to alert us to any issues.

Useful Contacts

leonardo.marini@justice.gov.uk

Additional Information

This was originally done using Firehose - the original ticket is here: https://github.com/ministryofjustice/modernisation-platform/issues/6163

Definition of Done

dms1981 commented 1 month ago

From discussion with Leonardo, I think we actually have almost everything in place. We already collate the required logs in S3 (VPC Flow Logs, Route 53 Resolver logs, Network Firewall logs), so all that ought to be required is the SQS setup, notifications to SQS, and an appropriate user & policy for XSIAM.
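For illustration, a minimal sketch of that SQS wiring; it assumes a pre-existing `aws_s3_bucket.logs` resource, and the queue name is a placeholder rather than the platform's actual configuration:

```hcl
# Queue for XSIAM to poll for new-object notifications (name is a placeholder).
resource "aws_sqs_queue" "xsiam_notifications" {
  name = "xsiam-log-notifications"
}

# Allow S3 to send event notifications from the logging bucket to the queue.
resource "aws_sqs_queue_policy" "xsiam_notifications" {
  queue_url = aws_sqs_queue.xsiam_notifications.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "s3.amazonaws.com" }
      Action    = "sqs:SendMessage"
      Resource  = aws_sqs_queue.xsiam_notifications.arn
      Condition = { ArnEquals = { "aws:SourceArn" = aws_s3_bucket.logs.arn } }
    }]
  })
}

# Publish an event to the queue whenever a new log object is written.
resource "aws_s3_bucket_notification" "logs" {
  bucket = aws_s3_bucket.logs.id

  queue {
    queue_arn = aws_sqs_queue.xsiam_notifications.arn
    events    = ["s3:ObjectCreated:*"]
  }
}
```

The XSIAM IAM user would additionally need `sqs:ReceiveMessage`/`sqs:DeleteMessage` on the queue and `s3:GetObject` on the bucket.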

mikereiddigital commented 1 month ago

Hi @dms1981. When I looked into this I could not see where the VPC Flow Logs output to S3 is set up, so my impression was that we would need to add additional buckets & new flow logs to output to them. I was also considering the merit of keeping all of these buckets in core-logging - so the same cross-account configuration that CloudTrail uses in its module - and having them accessible to the single IAM user account already set up. How this could translate into actionable Terraform:
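A minimal sketch of that shape, assuming a hypothetical destination bucket ARN in core-logging and a `vpc_id` input variable:

```hcl
variable "vpc_id" {
  type        = string
  description = "VPC whose flow logs should be delivered to S3"
}

# Send VPC Flow Logs straight to an S3 bucket in core-logging
# (the bucket ARN here is a placeholder).
resource "aws_flow_log" "to_s3" {
  vpc_id               = var.vpc_id
  traffic_type         = "ALL"
  log_destination_type = "s3"
  log_destination      = "arn:aws:s3:::core-logging-vpc-flow-logs"
}
```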

Anyhow let me know what you think. Talk to you tomorrow.

dms1981 commented 1 month ago

I've done some reading and agree with the approach of exporting the logs into our core-logging account. I think we'll want a new bucket (with attendant bucket policy & SQS), and some expansion of the existing user to access that bucket. From there, using AWS Data Firehose to stream logs into the new bucket feels like the best approach. It's referenced in the AWS docs in these places:

I did originally think that we were putting our logs into S3, but on checking that's not the case - the logs in question are sent to CloudWatch. Reconfiguring our logging to send to S3 instead of CloudWatch doesn't feel like the right approach, but streaming the logs feels architecturally correct (and is supported by the documentation I reference above).
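As a sketch of that streaming shape (the IAM roles, log group name, and bucket reference below are placeholders assumed to exist elsewhere):

```hcl
# Firehose delivery stream writing into the new core-logging bucket.
resource "aws_kinesis_firehose_delivery_stream" "logs_to_s3" {
  name        = "cloudwatch-logs-to-s3"
  destination = "extended_s3"

  extended_s3_configuration {
    role_arn   = aws_iam_role.firehose.arn # role allowing Firehose to write to the bucket
    bucket_arn = aws_s3_bucket.logs.arn
  }
}

# Stream an existing CloudWatch log group into the delivery stream.
resource "aws_cloudwatch_log_subscription_filter" "vpc_flow_logs" {
  name            = "vpc-flow-logs-to-firehose"
  log_group_name  = "vpc-flow-logs" # placeholder log group name
  filter_pattern  = ""              # empty pattern forwards all events
  destination_arn = aws_kinesis_firehose_delivery_stream.logs_to_s3.arn
  role_arn        = aws_iam_role.cwl_to_firehose.arn # role for CloudWatch Logs to call Firehose
}
```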

dms1981 commented 1 month ago

I propose to take the following approach in solving this story:

  1. Create new resources (bucket, queue) in core-logging & a Terraform module to stream logs (a usage sketch follows this list)
  2. Test the module & delivery of logs from core-vpc-sandbox
  3. Use the new module to stream logs across to core-logging as required by the SOC team
  4. Clean up any old resources that are no longer used
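Assuming a module along those lines, a member-account call might look like the following; the module path and inputs are hypothetical, not the platform's actual module:

```hcl
module "log_streaming" {
  # Hypothetical module path; the real module name may differ.
  source = "../../modules/firehose-log-streaming"

  # Log groups to stream and the destination bucket in core-logging (placeholders).
  log_group_names        = ["vpc-flow-logs", "route53-resolver-logs"]
  destination_bucket_arn = "arn:aws:s3:::core-logging-soc-logs"
  tags                   = local.tags # assumes the usual local tags map
}
```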

dms1981 commented 2 weeks ago

This has been implemented in line with existing examples. There are some questions around behaviours seen with SQS queues and the Cortex application, but those will be handled separately to this issue.

dms1981 commented 2 weeks ago

After a conversation with @mikereiddigital, this one is going into blocked until the application issues are resolved. Once the application is picking up logs successfully, the old Firehose implementation will be removed.