cds-snc / notification-planning

Project planning for GC Notify Team

Produce AWS security findings of abnormal behaviour and pipe them to the AWS Landing Zone security hub for the SRE team to detect and respond to #726

Closed by yaelberger-commits 10 months ago

yaelberger-commits commented 2 years ago

Description

  1. Assess what we currently have in the logs and have the SRE team write queries on their end to trigger certain alarms.
  2. Identify gaps in what we'd like to have alarms on, implement them on the Notify side, and then have the SRE team create new alarms.

Starting with a soft step (i.e. step 1) would better align the two teams on the overall technical and business requirements. For example, it's not clear what format they expect, whether we'd need to massage existing logs, or which new ones to create. Can they also take structured events, on top of unstructured data such as logs?
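To make the structured-versus-unstructured question concrete when talking to SRE, here is a minimal sketch of what a structured security event could look like as a single JSON log line. The function name and the fields `event_type`, `service_id` and `details` are hypothetical, not an agreed format; the actual schema would depend on what the SRE team can ingest.

```python
import json
from datetime import datetime, timezone


def build_security_event(event_type: str, service_id: str, **details) -> str:
    """Serialize one security-relevant event as a single JSON log line."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,  # hypothetical name, e.g. "failed_login"
        "service_id": service_id,  # hypothetical field identifying the Notify service
        "details": details,        # free-form context for the event
    }
    # sort_keys keeps the key order stable, which makes log queries easier to write
    return json.dumps(event, sort_keys=True)


# Example: one failed login, emitted as a JSON line that a log query can filter on
print(build_security_event("failed_login", "service-123", username="user@example.com"))
```

A line like this can still be written to the existing logs, so it would not force a choice between logs and events up front.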

First, talk to SRE and trigger alarms and work our way through that. Do not follow the ideas section until after!

Ideas

Anomaly detection of metrics: admin usage and the other use cases that Pat mentioned. Pipe them to CloudWatch and then see what works.

  1. Can this detect the difference between our services?

    • If service changes its sending pattern
    • Dimensions in metrics
  2. Brainstorm ideas for metric detection:

    • API usage
    • Service limits?
    • Failed login attempts (# of times, per user name)
    • Number of times the user tries to reset their password
    • Validation errors - the wrong API key being used? Maybe send it to Sentinel
  3. What are the limits? If we trigger alarms at over 80%, what is that 80% measured against?
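As a starting point for the brainstorm, two of the ideas above (failed login attempts per user name, and alarming at 80% of a limit) can be sketched in a few lines. This is only an illustration under assumptions: the event shape, the function names and the threshold values are hypothetical, and in practice the counting might live in a CloudWatch metric or a log query rather than in application code.

```python
from collections import Counter


def failed_logins_over_threshold(events, threshold):
    """Return {username: count} for users with at least `threshold` failed logins."""
    counts = Counter(
        e["username"] for e in events if e["event_type"] == "failed_login"
    )
    return {user: n for user, n in counts.items() if n >= threshold}


def usage_ratio_exceeded(used, limit, ratio=0.8):
    """True when `used` is at or above `ratio` of `limit` (80% by default)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return used / limit >= ratio


# Hypothetical events, shaped like parsed log lines
events = [
    {"event_type": "failed_login", "username": "alice"},
    {"event_type": "failed_login", "username": "alice"},
    {"event_type": "failed_login", "username": "alice"},
    {"event_type": "failed_login", "username": "bob"},
    {"event_type": "password_reset", "username": "bob"},
]

print(failed_logins_over_threshold(events, threshold=3))
print(usage_ratio_exceeded(used=900, limit=1000))
```

Writing the limit down as an explicit `limit` argument forces an answer to the "80% of what?" question for each metric.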

Acceptance Criteria (Definition of done)

Pick one metric and do the below:

  1. Define metrics for the above use cases
  2. Define the limit per metric
  3. Send metric results to AWS Landing Zone
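For step 3, one possible shape is a finding in the AWS Security Finding Format (ASFF), which is what Security Hub ingests. The sketch below only builds the finding as a dictionary. The function name, the `notify-` prefixes, the resource ID and the severity are assumptions for illustration, and whether the Landing Zone Security Hub accepts custom findings from the Notify account this way would need to be confirmed with SRE.

```python
from datetime import datetime, timezone


def build_finding(account_id, region, metric_name, value, limit):
    """Build a minimal ASFF finding for a metric that crossed its limit."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "SchemaVersion": "2018-10-08",
        "Id": f"notify/{metric_name}/{now}",  # hypothetical ID scheme
        # Default product ARN used for custom findings within an account
        "ProductArn": (
            f"arn:aws:securityhub:{region}:{account_id}:"
            f"product/{account_id}/default"
        ),
        "GeneratorId": f"notify-{metric_name}",  # hypothetical generator name
        "AwsAccountId": account_id,
        "Types": ["Unusual Behaviors/Application"],
        "CreatedAt": now,
        "UpdatedAt": now,
        "Severity": {"Label": "MEDIUM"},  # assumed severity, to be tuned
        "Title": f"{metric_name} exceeded its limit",
        "Description": f"{metric_name} reached {value} (limit {limit}).",
        "Resources": [{"Type": "Other", "Id": "notify-service"}],  # placeholder
    }


finding = build_finding("123456789012", "ca-central-1", "failed_logins", 12, 10)
print(finding["Title"])

# Submitting it would look roughly like this (not run here, needs AWS credentials):
#   import boto3
#   boto3.client("securityhub").batch_import_findings(Findings=[finding])
```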

QA Steps

yaelberger-commits commented 2 years ago

Hey team! Please add your planning poker estimate with ZenHub @andrewleith @jimleroyer @jzbahrai @sastels

yaelberger-commits commented 2 years ago

Duplicate of #272 so closing 272

yaelberger-commits commented 2 years ago

@patheard Can you let us know if this is covered by the logs we send to CCCS or if we still need to do more to tackle this? Thanks

patheard commented 2 years ago

My understanding of this issue is that it would be more for our internal detection of issues, giving us a chance to fine tune the alerts on what we're interested in catching. Off the top of my head, it would be things like:

Happy to chat more and brainstorm on what abnormal behaviour looks like and how we could start detecting it.

mohdnr commented 2 years ago

+1 to what Pat mentioned. I'd start with those and then update your user story template to include a line related to:

yaelberger-commits commented 2 years ago

Added the two bullet points above to the user story template, as per Mohamed's suggestion.