Open ara4n opened 2 months ago
Here are the logs from Synapse:
3670827:2024-07-29 14:53:13,497 - synapse.http.server - 782 - WARNING - POST-8aadc2a6593fbef6-LHR - Not sending response to request <XForwardedForRequest at 0x7f18a38b3ca0 method='POST' uri='/_matrix/client/v3/rooms/!YTvKGNlinIzlkMTVRl%3Amatrix.org/upgrade' clientproto='HTTP/1.1' site='8080'>, already disconnected.
3670828:2024-07-29 14:53:13,498 - synapse.access.http.8080 - 473 - INFO - POST-8aadc2a6593fbef6-LHR - 77.108.142.78 - 8080 - {@matthew:matrix.org} Processed request: 1854.869sec/-1674.869sec (132.897sec, 15.687sec) (90.976sec/342.120sec/42165) 0B 200! "POST /_matrix/client/v3/rooms/!YTvKGNlinIzlkMTVRl%3Amatrix.org/upgrade HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) ElementNightly/2024072901 Chrome/126.0.6478.127 Electron/31.2.1 Safari/537.36" [35554 dbevts]
Parsing the request log, we can see that the request took 1854s to complete. It looks like Synapse did not time out the request - but potentially the reverse proxy? Either way, this is exceedingly long.
It looks like this request took a long time due to copying over user's push rules and account data from this room to the other. The logs are littered with:
...
Waiting for repl stream 'account_data' to reach 3244591758 (account_data-1); currently at: 3244591756
Waiting for repl stream 'push_rules' to reach 236699064 (push_rules-1); currently at: 236699062
...
This appears to happen user-by-user sequentially over replication:
The main process handles the room upgrade request, while the push-rules-1
and account-data-1
workers handles those changes respectively.
We could shave a lot of time and resources off processing this request by just doing those two operations in larger batches than 1.
Description
Upgrading large rooms takes more than 10s, but after 10s synapse emits some kind of timeout error, leaving the client in a mess.
Steps to reproduce
Homeserver
matrix.org
Synapse Version
Whatever was live 2024-07-29T14:23:58.642Z
Installation Method
Other (please mention below)
Database
postgres
Workers
Multiple workers
Platform
debian
Configuration
No response
Relevant log output
Anything else that would be useful to know?
No response