learn-video / mosaic-video

Generate mosaics from video inputs
Apache License 2.0

Distribute workload #33

Closed mauricioabreu closed 8 months ago

mauricioabreu commented 8 months ago

Is your feature request related to a problem? Please describe.

Currently, the program is designed to run on a single instance. We need high availability and a way to scale it horizontally.

Describe the solution you'd like

Every worker should process (number of mosaics / number of nodes) mosaics. This gives an even distribution of the workload.

We also need to handle topics like:

Suggested approach 1:

Whenever a worker starts, it should insert or update a key in Redis under the workers namespace. We could use the hostname or a predefined ID for each worker. This is crucial for workload distribution. For instance, if two workers are registered and there are 10 mosaics to be processed, each worker would handle number of mosaics / number of nodes = 5 mosaics, as initially outlined.
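As a rough illustration of the split described above, here is a minimal sketch in Go. The function name `assign` and the worker and mosaic IDs are hypothetical, not taken from the project. It assumes every node can read the same list of registered workers, and sorts both lists so that each node computes the same assignment on its own.

```go
package main

import "sort"

// assign distributes mosaics across workers round-robin, so each worker
// receives either floor(m/n) or ceil(m/n) mosaics.
// Both inputs are copied and sorted so the result is deterministic
// regardless of the order in which a node happened to read them.
// (Hypothetical helper, not part of mosaic-video.)
func assign(workers, mosaics []string) map[string][]string {
	out := make(map[string][]string, len(workers))
	if len(workers) == 0 {
		return out
	}
	ids := append([]string(nil), workers...)
	sort.Strings(ids)
	items := append([]string(nil), mosaics...)
	sort.Strings(items)

	for _, id := range ids {
		out[id] = nil // a worker with no mosaics still appears in the result
	}
	for i, m := range items {
		id := ids[i%len(ids)]
		out[id] = append(out[id], m)
	}
	return out
}
```

With two workers and 10 mosaics, each worker ends up with 5. When the count does not divide evenly, the first workers in sorted order take one extra mosaic.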


Maybe we can expose some metrics of which mosaics are being processed by each worker.

When a worker dies (or is intentionally terminated), the node must be unregistered, either through TTL or graceful shutdown. Subsequently, other workers should take over the tasks that were being processed by the now unavailable worker.

When a new worker joins the cluster, it is necessary to redistribute the workload. For instance, if there are 2 workers processing a total of 12 tasks, each worker handles 6 tasks. If a new worker joins, we need to terminate 2 tasks (possibly selected at random) from each of the currently running workers. The new worker will then take on 4 of these tasks, ensuring that each worker is now processing 4 tasks.
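The arithmetic in this example can be sketched as a small planning function. `shedPlan` is a hypothetical name. It only computes how many tasks each running worker should give up when one new worker joins. Which tasks to terminate (for example, at random) is left out.

```go
package main

import "sort"

// shedPlan takes the current task count per worker and returns, for the
// case where exactly one new worker joins, how many tasks each existing
// worker should release and how many tasks the new worker picks up.
// (Hypothetical helper, for illustration only.)
func shedPlan(current map[string]int) (map[string]int, int) {
	ids := make([]string, 0, len(current))
	total := 0
	for id, n := range current {
		ids = append(ids, id)
		total += n
	}
	sort.Strings(ids) // deterministic choice of who keeps a leftover task

	nodes := len(current) + 1 // existing workers plus the newcomer
	base, rem := total/nodes, total%nodes

	shed := make(map[string]int, len(current))
	freed := 0
	for i, id := range ids {
		keep := base
		if i < rem {
			keep++ // spread the remainder over the first workers
		}
		if extra := current[id] - keep; extra > 0 {
			shed[id] = extra
			freed += extra
		}
	}
	return shed, freed
}
```

For two workers with 6 tasks each, the plan is to release 2 tasks per worker, and the new worker takes the 4 freed tasks, leaving 4 on every node.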

Suggested approach 2:

We can use a framework like asynq, with Redis as the queue.

Concerns such as health checks and workload distribution would still need handling, but to a lesser degree.

Describe alternatives you've considered

No changes needed.

Additional context

We already use a strategy to lock the mosaic, but the lock needs to include the node ID.

mauricioabreu commented 8 months ago

Closed via #35