cds-snc / notification-planning-core

Project planning for GC Notify Core Team

Rework SMS pipeline for priority lanes to make it more similar to emails #2

Open yaelberger-commits opened 2 years ago

yaelberger-commits commented 2 years ago

Description

As a user of GCNotify, I want my notifications to be sent within a certain time period, So that it matches my needs and expectations.

As a product owner of GCNotify, I want the notifications to be sent within the internal SLOs, So that I know my product operates within established performance parameters.

As a system op of GCNotify, I want my system to scale per the parameters of the internal SLOs, So that I have a reliable service.

WHY are we building?

WHAT are we building?

A reworked priority lane system that fits our expectations and speeds up delivery of notifications based on their set priority lane.

VALUE created by our solution

Faster service matching user experience expectations.

Acceptance Criteria (Definition of done)

QA Steps

jimleroyer commented 2 years ago

We can also revisit the design of the throttled SMS queue.

Basically, all services with an associated long code go through >>one<< Celery worker that processes the throttled SMS queue at a rate of 1 notification every 2 seconds; this applies to all dedicated long codes. That means that if several services send a burst of SMS at the same time, all using dedicated long codes, even different ones, they all share the same throttle.
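To put rough numbers on the shared throttle, here is a minimal sketch of the arithmetic. The 2-second interval comes from the comment above; the long code names, the burst sizes, and the per-long-code alternative are assumptions made only for comparison, not the current or planned design.

```python
# 1 notification every 2 seconds, as described in the comment above.
THROTTLE_INTERVAL_SECONDS = 2.0


def shared_drain_seconds(burst_sizes: dict[str, int]) -> float:
    """Time to drain every burst when all long codes share one throttled worker."""
    return sum(burst_sizes.values()) * THROTTLE_INTERVAL_SECONDS


def per_code_drain_seconds(burst_sizes: dict[str, int]) -> float:
    """Hypothetical alternative: each long code has its own throttle,
    so the slowest (largest) burst determines the overall drain time."""
    return max(burst_sizes.values(), default=0) * THROTTLE_INTERVAL_SECONDS


# Hypothetical example: three services, each bursting 100 SMS on its own long code.
bursts = {"long-code-a": 100, "long-code-b": 100, "long-code-c": 100}

print(shared_drain_seconds(bursts))    # all 300 SMS behind one throttle
print(per_code_drain_seconds(bursts))  # 100 SMS per throttle, drained in parallel
```

With these made-up numbers, the shared throttle takes three times as long as independent throttles would, which is the coupling between services that the comment points out.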

jimleroyer commented 1 year ago

An idea worth exploring is to include dynamic queues in this epic: the ability to compartmentalize very large bulk uploads into their own queues would avoid slowing down other users and amortize delivery time for everyone. The support team would also need to plan ahead less for extra-large bulk sends. Furthermore, this could be a way to reward usage of the scheduling feature, as we'd need a way to trigger the dynamic queue feature anyway. This could kill two birds with one stone: we want to encourage users to use the scheduling feature (they would benefit from faster and more uniform sends) while also isolating extra-large bulk sends so they don't affect other users.
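A minimal sketch of how that routing decision could look, assuming a dedicated queue is created only for scheduled sends above some size. The queue names, the threshold, and the `queue_for_job` helper are all hypothetical and not taken from the existing code base.

```python
# Hypothetical values: neither the shared queue name nor the threshold
# comes from the existing system.
SHARED_BULK_QUEUE = "bulk-sms"
DYNAMIC_QUEUE_THRESHOLD = 100_000


def queue_for_job(job_id: str, recipient_count: int, is_scheduled: bool) -> str:
    """Pick a queue name for a bulk job.

    Very large scheduled jobs get their own dynamic queue so they do not
    slow down other users; everything else stays on the shared bulk queue.
    """
    if is_scheduled and recipient_count >= DYNAMIC_QUEUE_THRESHOLD:
        return f"dynamic-bulk-{job_id}"
    return SHARED_BULK_QUEUE


print(queue_for_job("job-123", 250_000, is_scheduled=True))   # dedicated queue
print(queue_for_job("job-456", 250_000, is_scheduled=False))  # shared queue
print(queue_for_job("job-789", 500, is_scheduled=True))       # shared queue
```

Tying the dedicated queue to `is_scheduled` mirrors the idea of rewarding the scheduling feature: the schedule is what gives the system advance notice to provision the queue.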

Furthermore, and this goes well beyond the previous suggestion, the scheduling feature lets us control the time periods when users can send. If some time periods are busier than usual, we can use blackout periods to prevent users from scheduling during them, and we could also introduce a delay feature if the system is overloaded at the scheduled start time (with proper email communication to service owners).
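The blackout and delay ideas could be sketched roughly as follows. The window times, the 15-minute delay, and both helper names are assumptions for illustration only; how overload would actually be detected is left out.

```python
from datetime import datetime, time, timedelta

# Hypothetical blackout windows (start inclusive, end exclusive), in local time.
BLACKOUT_WINDOWS = [(time(9, 0), time(10, 0))]

# Hypothetical delay applied when the system is overloaded at start time.
OVERLOAD_DELAY = timedelta(minutes=15)


def in_blackout(when: datetime, windows=BLACKOUT_WINDOWS) -> bool:
    """Return True if the requested schedule time falls inside a blackout window."""
    return any(start <= when.time() < end for start, end in windows)


def effective_start(scheduled: datetime, overloaded: bool) -> datetime:
    """Push the start time back when the system is overloaded.

    The caller would also be responsible for emailing the service owner.
    """
    return scheduled + OVERLOAD_DELAY if overloaded else scheduled


requested = datetime(2024, 1, 8, 9, 30)
print(in_blackout(requested))                       # inside the 09:00-10:00 window
print(effective_start(requested, overloaded=True))  # delayed by 15 minutes
```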