element-hq / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://element-hq.github.io/synapse
GNU Affero General Public License v3.0

Full high-availability (Redis Cluster/Sentinel support) #16984

Open Stogas opened 6 months ago

Stogas commented 6 months ago

Description:

Currently (as of v1.102.0), Synapse supports horizontal scaling via workers. As I understand it, workers can process requests largely independently of the main process (of which only one instance can run), so at first glance Synapse looks highly available.

However, using workers requires Redis. Synapse (again, as of v1.102.0) only supports a single Redis hostname and port. It does not support Redis Sentinel, which would handle election of the Redis master (the write-capable node) and redirect clients to it.
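To illustrate what a Sentinel-aware client has to do, here is a minimal sketch of the discovery step: ask each known Sentinel for the current master address using the `SENTINEL get-master-addr-by-name` command and connect to whatever address comes back. This is not Synapse code. The function names, the 0.5 s timeout, and the injectable `ask` callable are assumptions made for illustration, and authentication, TLS, and partial reads are ignored.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def parse_master_reply(reply: bytes) -> Optional[Address]:
    """Parse the two-element (ip, port) array reply; None if no master is reported."""
    fields = reply.split(b"\r\n")
    if len(fields) < 5 or fields[0] != b"*2":
        return None
    return fields[2].decode(), int(fields[4])


def ask_over_tcp(sentinel: Address, payload: bytes) -> bytes:
    """Send one command to a Sentinel and read its (short) reply."""
    with socket.create_connection(sentinel, timeout=0.5) as conn:
        conn.sendall(payload)
        return conn.recv(4096)


def discover_master(
    sentinels: Iterable[Address],
    master_name: str,
    ask: Callable[[Address, bytes], bytes] = ask_over_tcp,
) -> Optional[Address]:
    """Return the master address reported by the first Sentinel that answers."""
    payload = encode_command("SENTINEL", "get-master-addr-by-name", master_name)
    for sentinel in sentinels:
        try:
            master = parse_master_reply(ask(sentinel, payload))
        except OSError:
            continue  # this Sentinel is unreachable; try the next one
        if master is not None:
            return master
    return None
```

A client would call something like `discover_master([("sentinel-1", 26379), ("sentinel-2", 26379)], "mymaster")` (hostnames and master name are placeholders) on startup and again after a connection loss, which is the part a fixed host/port configuration cannot express.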

I've found a PR in the old matrix-org repo that adds Redis Sentinel support, but I'm not able to maintain it, so I'm opening an issue here instead.

Additionally, I think it would be worthwhile to add guidance to the Synapse documentation on how to achieve a highly available setup. The 2020 post about scaling Synapse seems to show multiple Redis instances in its diagrams, but I can't figure out how to achieve this with Synapse, since supporting Redis Cluster requires the Redis client to be cluster-aware.
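For context on what "cluster-aware" means in practice: in Redis Cluster every key belongs to one of 16384 hash slots, computed as CRC16 (XMODEM variant) of the key modulo 16384, and the client is expected to send each command to the node that owns that slot. The sketch below shows only that slot calculation, including the `{hash tag}` rule. The function name `key_slot` is my own, and a real client would also need to fetch the slot-to-node map and follow redirections.

```python
from binascii import crc_hqx

HASH_SLOTS = 16384  # fixed number of slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot for a key."""
    # Hash tag rule: if the key contains "{...}" with a non-empty body,
    # only the text between the first "{" and the next "}" is hashed.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is CRC16/XMODEM (polynomial 0x1021).
    return crc_hqx(key, 0) % HASH_SLOTS
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, map to the same slot and therefore the same node. A client that only knows a single host and port has no way to do this routing.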

Related issues:

benbenik commented 4 months ago

Good that you've opened this issue, @Stogas! I've been comparing solutions. Element/Matrix checks all boxes but one: high availability. A single point of failure is not an option: "one == none". Clients should keep working if a server goes down, without the client or app being interrupted. HA is a key feature for making the infrastructure secure (availability) and the experience robust. Looking forward to seeing progress in this area. Keep up the good work!

dklimpel commented 4 months ago

Regarding Redis Cluster/Sentinel support: is Redis still the right database to use in the long term, or is there a move to the Linux Foundation's fork (Valkey)?