soof-golan / tix-q

Waiting rooms for all!
https://waitingroom.soofgolan.com
MIT License

`/register` will almost always get overwhelmed, and will probably crash #7

Open nukemberg opened 2 weeks ago

nukemberg commented 2 weeks ago

Since uvicorn has no default concurrency limit, and since the business flow is designed such that all users will hit /register at roughly the same time, this is a classic thundering herd problem. uvicorn will accept too many requests and will slow down, consume lots of memory, and suffer increased GC overhead and latency, at which point users will retry in a loop. This will continue until either uvicorn dies, users give up, or all users are registered, which may take a long time: much longer than if requests were handled serially (due to the retry loops).

Suggested solution:

  1. Limit uvicorn concurrency according to DB connection pool capacity, memory and CPU capacity
  2. Change the business flow such that users are not selected based on temporal order but rather on random selection. If this seems "unfair" to you, notice that users are already randomly selected based on their retry loop and the (random) load status of the server. The new flow will be:
    1. User fills in the form (the form will not be displayed, or will be locked, when the room is closed)
    2. User submits the form immediately, and the server assigns a random waiting_position (which can be a UUID; it doesn't have to be a number)
    3. After the waiting room is closed, /room.registrants lists users ordered by waiting_position

Removing the coordinated, time-based enabling of the register button will remove the initial thundering herd and the subsequent retry loop. Load will be "smeared" over the entire waiting room activity period, and this is also conceptually more "fair" to users, who will not have to spend time clicking "register" again and again.
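To make the proposed flow concrete, here is a minimal in-memory sketch, not the project's actual code. The `WaitingRoom` and `Registrant` names, the use of an email as the user key, and the choice to make resubmission idempotent (a second submit does not re-roll the position) are all assumptions for illustration; only `waiting_position` and the "list after close, ordered by position" behaviour come from the steps above.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Registrant:
    # Hypothetical record; a real app would persist this in the DB.
    email: str
    # Random position assigned server-side; a UUID hex string sorts fine.
    waiting_position: str = field(default_factory=lambda: uuid.uuid4().hex)


class WaitingRoom:
    """Hypothetical in-memory stand-in for the waiting room."""

    def __init__(self) -> None:
        self.is_open = True
        self._registrants: dict[str, Registrant] = {}

    def register(self, email: str) -> Registrant:
        # Step 1/2: submissions are accepted at any time while the room is open.
        if not self.is_open:
            raise RuntimeError("waiting room is closed")
        # Assumption: resubmitting keeps the first position, so retrying
        # gives the user no advantage.
        return self._registrants.setdefault(email, Registrant(email))

    def close(self) -> None:
        self.is_open = False

    def registrants(self) -> list[Registrant]:
        # Step 3: the order is only revealed once the room has closed.
        if self.is_open:
            raise RuntimeError("waiting room is still open")
        return sorted(self._registrants.values(), key=lambda r: r.waiting_position)
```

Because the position does not depend on arrival time, there is no incentive for everyone to submit at the same instant, which is what removes the herd.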

soof-golan commented 2 weeks ago

Re:

Limit uvicorn concurrency

Excellent suggestion, will probably do that before the next event.
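A rough sketch of what that could look like, assuming the limit is derived from the DB pool size: the helper names, the `headroom` value, and the `app.main:app` module path are hypothetical, while `--limit-concurrency` is uvicorn's existing option for capping concurrent connections or tasks (beyond the cap it answers with HTTP 503).

```python
import shlex


def concurrency_limit(db_pool_size: int, max_overflow: int = 0, headroom: int = 2) -> int:
    """Cap in-flight work at what the DB pool can serve, plus a small
    headroom (an assumed value) for requests that never touch the DB."""
    if db_pool_size < 1:
        raise ValueError("db_pool_size must be at least 1")
    return db_pool_size + max_overflow + headroom


def uvicorn_command(app_path: str, limit: int) -> list[str]:
    # app_path is a placeholder; substitute the project's real ASGI app path.
    return ["uvicorn", app_path, "--limit-concurrency", str(limit)]


if __name__ == "__main__":
    limit = concurrency_limit(db_pool_size=10, max_overflow=5)
    print(shlex.join(uvicorn_command("app.main:app", limit)))
```

Memory and CPU capacity would also need to feed into the number; this sketch only covers the pool-size part. Note that clients receiving a 503 may still retry, so the cap protects the server but does not by itself stop the retry loop.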

Re:

Change the business flow such that users are not selected based on temporal order but rather on random selection.

There are legal implications that I am not willing to undertake by randomizing the participants' order. Look for out-of-band discussions on this.