As a Notify dev, I need to keep an eye on when the SMS rate limit is reached (without being woken up in the night)
WHY are we building?
As currently configured, Notify will send SMS fragments at a rate of up to 5500 / minute. Meanwhile, our rate limit with AWS is 3000 / minute. We recently had several Celery errors of the form:
An error occurred (Throttling) when calling the Publish operation (reached max retries: 4): Rate exceeded
We should surface these errors better while also ensuring that they do not trigger an undue critical alert in the middle of the night.
WHAT are we building?
Translate generic Celery exceptions into app-specific exceptions rather than letting them fall through to the general "celery error" (it might be worth re-raising or explicitly calling the retry mechanism).
Add appropriate warning and critical alarms via CloudWatch that will inform us when rate limits are reached, based on this new specialized exception.
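A minimal sketch of the exception translation, as one possible starting point. The names `SmsRateLimitExceeded`, `is_throttling_error` and `publish_sms` are hypothetical and not existing Notify code. The sketch assumes the underlying error exposes a botocore-style `response["Error"]["Code"]` value of `Throttling`, matching the error message quoted above, and it duck-types on that attribute so it does not need to import botocore.

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: the error code shown in the quoted message, "(Throttling)".
THROTTLING_CODES = {"Throttling"}


class SmsRateLimitExceeded(Exception):
    """Hypothetical app-specific exception for a throttled SMS publish."""


def is_throttling_error(exc):
    """Return True if exc carries a botocore-style throttling error code."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    code = response.get("Error", {}).get("Code")
    return code in THROTTLING_CODES


def publish_sms(publish, **kwargs):
    """Call publish(**kwargs) and translate throttling into our own exception.

    `publish` is any callable, for example a bound SNS client method.
    Non-throttling errors are re-raised unchanged.
    """
    try:
        return publish(**kwargs)
    except Exception as exc:
        if is_throttling_error(exc):
            # A distinct log line that a dedicated alarm could key on,
            # separate from the general "celery error".
            logger.warning("SMS rate limit reached: %s", exc)
            raise SmsRateLimitExceeded(str(exc)) from exc
        raise
```

The calling Celery task could then catch `SmsRateLimitExceeded` and either re-raise it or call its retry mechanism explicitly, which is the open question noted above.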
VALUE created by our solution
We will be aware of how often this specific error is happening, and hopefully it won't wake us up in the night.
Acceptance Criteria
[ ] When a rate limit error is caught, the rate limit warning goes off.
[ ] The general "celery error" warning does not go off for rate limit errors.
Description
QA Steps