getlift / lift

Expanding Serverless Framework beyond functions using the AWS CDK
MIT License

Excessive INSUFFICIENT_DATA -> OK alarm emails from Queue construct #292

Closed by tomchiverton 1 year ago

tomchiverton commented 1 year ago

Start from the Use-case

Our Queue constructs are not used 24/7, so every time the first request comes in after a period of inactivity, we get a `Subject: OK ...` email indicating:

Alarm Details:
- Name:                       project-stage-functionAlarmExecutionsFailed-random
- Description:                Share step functions errors
- State Change:               INSUFFICIENT_DATA -> OK
- Reason for State Change:    Threshold Crossed: 1 datapoint [0.0 (01/02/23 08:40:00)] was not greater than the threshold (0.0).
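
To illustrate why this transition repeats, here is a minimal sketch of an alarm evaluated over a single period. It is a simplified model, not CloudWatch's actual evaluation logic: it assumes a "greater than threshold" comparison (as the reason text above suggests) and assumes that a period with no datapoint leaves the alarm in INSUFFICIENT_DATA. The names `evaluateAlarm` and `stateChanges` are hypothetical.

```typescript
// Simplified, hypothetical model of a single-period alarm.
// Assumptions: the comparison is "greater than threshold", and a period
// without any datapoint puts the alarm in INSUFFICIENT_DATA.
type AlarmState = "OK" | "ALARM" | "INSUFFICIENT_DATA";

function evaluateAlarm(datapoint: number | undefined, threshold: number): AlarmState {
  if (datapoint === undefined) {
    return "INSUFFICIENT_DATA";
  }
  return datapoint > threshold ? "ALARM" : "OK";
}

// Lists every state change for a series of periods, starting from
// INSUFFICIENT_DATA (no data seen yet).
function stateChanges(datapoints: Array<number | undefined>, threshold: number): string[] {
  const changes: string[] = [];
  let previous = "INSUFFICIENT_DATA" as AlarmState;
  for (const datapoint of datapoints) {
    const current = evaluateAlarm(datapoint, threshold);
    if (current !== previous) {
      changes.push(`${previous} -> ${current}`);
    }
    previous = current;
  }
  return changes;
}

// Idle, one run with 0 failures, idle for two periods, then another run with 0 failures.
const changes = stateChanges([undefined, 0, undefined, undefined, 0], 0);
console.log(changes);
```

In this model, every burst of activity after an idle stretch produces a fresh INSUFFICIENT_DATA -> OK change, because 0 failures is not greater than the threshold of 0. Whether an email goes out for that change depends on the alarm having a notification action attached to the OK state.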

Example Config

constructs:
  foo:
    type: queue
    encryption: 'kmsManaged'
    alarm: us@domain.com
    worker:
      handler: module.foo
      vpc: ${self:custom.vpc}

Implementation Idea

constructs:
  foo:
    type: queue
    alarm: us@domain.com
      when_ok: false

mnapoli commented 1 year ago

Interesting 🤔

To be honest, those emails shouldn't be sent at all. I don't think we need to add an option; we should just change the way the alarms work now. Do you have any idea how to fix that?

tomchiverton commented 1 year ago

The only alarm I see in https://github.com/getlift/lift/blob/master/src/constructs/aws/Queue.ts isn't this one. IDK what's sending it then?

tomchiverton commented 1 year ago

In the end, I think I've tracked it down to a component other than Lift.