matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

Gathering performance data of synapse deployments #4659

Open targodan opened 5 years ago

targodan commented 5 years ago

Description:

It would be nice if we could gather some performance information about synapse deployments.

Short-Term:

Use a Google Doc or something similar so that admins who are willing can contribute data from their deployments. This data is mostly subjective and depends on a lot of unknowns (like how many users are in which federated channels), but it might still give people who are thinking about setting up a HS a starting point for what to expect. I am hoping these unknowns will, at least somewhat, average out with enough contributions.

Also new admins might not even know how many users they can expect to use federated channels, so having these unknowns in the averages might even be a good thing.

I have drafted a quick Google Doc to show you what I mean. It is public, so any Synapse admin can already contribute. Just keep in mind that this document might be changed, hosted somewhere else or discontinued entirely at any point. https://docs.google.com/spreadsheets/d/1hOOh_GK3cvzVFprfP_rWwhNe8ChBqFfJBqIz-36Hhac/edit?usp=sharing

The measurement type "Official Script" is meant for some sort of official script that could gather performance information over, say, 24 hours, average the data and detect peaks. I think the column "CPU-Time per Hour" is especially interesting, since it is almost impossible to measure by hand.
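
As a rough sketch of what such a script might compute (this is not an existing Synapse tool): assuming the script periodically records pairs of wall-clock time and cumulative CPU seconds for the Synapse process, the CPU-time per hour, the average utilisation and the peaks could be derived roughly like this. The names `Sample`, `Summary`, `summarise` and `peak_factor` are hypothetical, and how the samples are collected is deliberately left out.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence, Tuple

# Hypothetical sample format: (wall-clock seconds, cumulative CPU seconds).
Sample = Tuple[float, float]


@dataclass
class Summary:
    cpu_seconds_per_hour: float  # CPU seconds consumed per hour of wall-clock time
    mean_utilisation: float      # mean of the per-interval CPU seconds per second
    peak_utilisation: float      # highest per-interval utilisation
    peak_intervals: List[int]    # indices of intervals flagged as peaks


def summarise(samples: Sequence[Sample], peak_factor: float = 2.0) -> Summary:
    """Summarise CPU usage from cumulative CPU-time samples.

    An interval is flagged as a peak when its utilisation exceeds
    peak_factor times the mean utilisation (an arbitrary threshold
    chosen for this sketch).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    # Utilisation of each interval between two consecutive samples.
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append((c1 - c0) / (t1 - t0))

    elapsed = samples[-1][0] - samples[0][0]
    cpu_used = samples[-1][1] - samples[0][1]
    avg = mean(rates)
    peaks = [i for i, rate in enumerate(rates) if rate > peak_factor * avg]

    return Summary(
        cpu_seconds_per_hour=cpu_used / elapsed * 3600.0,
        mean_utilisation=avg,
        peak_utilisation=max(rates),
        peak_intervals=peaks,
    )
```

Calling `summarise` on a day's worth of samples would give one row for the spreadsheet; the peak threshold would probably need tuning against real deployments.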

Long-Term:

Maybe add a diagnostics API to synapse that can be used to make the aforementioned unknowns known in order to give a more accurate performance reading.

Maybe even some opt-in anonymous analytics that would be transmitted to a matrix.org hosted server.

Long-long-Term:

Maybe even add this API to the Matrix spec so people can properly compare different HS implementations.

realitygaps commented 5 years ago

This could be useful, as people could contribute the statistics they are comfortable sharing, and it could be done without directly identifying the homeserver.

It would be especially nice if it wasn't on Google Docs but used an EtherCalc or similar instead, so that Google accounts are not required to add to or edit it, which is a blocker for some contributors.

targodan commented 5 years ago

If it's just logging in you're worried about, that's not necessary. You can view and edit anonymously without logging in. I can understand though if people are uncomfortable with this being hosted on google servers.

neilisfragile commented 5 years ago

Admins can already opt in to sending some performance stats centrally, which covers some of this data. When we analysed it with a view to improving perf for smaller instances, we found it to be extremely noisy, to the point where it was hard to use in an actionable way.

We can't share the raw data publicly, simply because we don't have permission to do so, though had the data been more instructive we could have shared the high-level findings.

We will soon have a project to revisit this use case, and anything we find that helps with running smaller installations will be shared back with the community. I'd like us to be able to share as much as possible about the findings without breaking trust on the data.

That said, if there was a community effort to collect this sort of data, it would obviously be useful.