Closed pedro-nonfree closed 1 year ago
OK, so this is how I solved it in my local testing deployment:
I had a lot of devices
(...)
after signing out of all those devices, I no longer have this problem
the web console looks similar, but does not hit that spiral of requests
so it looks like this problem is solved for me. And for the next person reaching this, I hope it does not take nearly 2 hours of their time as it did mine.
I still think that this amount of queries should never happen.
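For anyone else hitting this, the account's devices can be audited from the client API before signing the stale ones out. A minimal sketch, assuming a local homeserver; `homeserver` and `access_token` are placeholders for your own deployment:

```shell
# Sketch: list this account's devices so stale ones can be signed out.
# homeserver and access_token are placeholders, not values from this issue.
homeserver='http://localhost:8008'
access_token='REPLACE_ME'

devices_endpoint="${homeserver}/_matrix/client/v3/devices"

list_devices() {
  # GET /_matrix/client/v3/devices lists all devices registered to the account
  curl -s -H "Authorization: Bearer ${access_token}" "${devices_endpoint}" \
    | jq -r '.devices[] | "\(.device_id)\t\(.display_name // "-")"'
}

# list_devices   # uncomment to run against a live homeserver
```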
Thanks @pedro-nonfree. The number of requests looks like the initial sync; depending on the number of rooms, messages, and devices there may be quite a few requests.
Closing this now then.
thanks @weeman1337, I am not very familiar with how initial sync works, but to me this looks like a bug, especially the huge number of requests per second
anyway, I am happy because I solved it, and I documented it so that anyone facing the same issue may find this
here is the screen capture before ending; for me, it looks like too many requests and too much data requested for a single user
it was only ~32 rooms with not much traffic in them (and the 22 devices)
according to docker, the PostgreSQL database was 39M:
```sh
[13:40:22] $ docker volume inspect matrix-test_matrix_db_data
[
    {
        "CreatedAt": "2022-12-29T02:20:08+01:00",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "matrix-test",
            "com.docker.compose.version": "2.14.1",
            "com.docker.compose.volume": "matrix_db_data"
        },
        "Mountpoint": "/var/lib/docker/volumes/matrix-test_matrix_db_data/_data",
        "Name": "matrix-test_matrix_db_data",
        "Options": null,
        "Scope": "local"
    }
]
[13:40:39] $ sudo du -sh /var/lib/docker/volumes/matrix-test_matrix_db_data/_data
39M	/var/lib/docker/volumes/matrix-test_matrix_db_data/_data
```
> The number of requests looks like the initial sync. Depending on the number of rooms, messages and devices there may be some requests.
@weeman1337 an initial sync is a single request: it is the very first `/sync` and has no `since` parameter. Only the size of that sync response varies with the factors you suggested.
@pedro-nonfree `/sync` is, per the spec, meant to be called immediately after the last `/sync` returns. It is a long-polling request: whenever your server sends a response, it expects the client to open a new request as a channel for the next response, as an alternative to SSE.
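To make the expected traffic pattern concrete, here is a minimal sketch of such a long-polling loop (not Element's actual code; `homeserver` and `access_token` are placeholders). The first request carries no `since`; every later one passes the previous `next_batch` token, so in steady state there is roughly one request per timeout interval, not hundreds per second:

```shell
# Hypothetical sketch of a client /sync long-polling loop.
# homeserver and access_token are placeholders, not values from this issue.
homeserver='http://localhost:8008'
access_token='REPLACE_ME'

build_sync_url() {
  # Build the /sync URL, appending `since` only after the initial sync.
  local since="$1"
  local url="${homeserver}/_matrix/client/v3/sync?timeout=30000"
  if [ -n "${since}" ]; then
    url="${url}&since=${since}"
  fi
  echo "${url}"
}

sync_loop() {
  local since=''
  while true; do
    # Long-poll: the server holds the request open up to the timeout.
    resp="$(curl -s -H "Authorization: Bearer ${access_token}" "$(build_sync_url "${since}")")"
    # The next request resumes from the token the server returned.
    since="$(echo "${resp}" | jq -r '.next_batch')"
  done
}

# sync_loop   # uncomment to run against a live homeserver
```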
@t3chguy I'm seeing an incredible number of requests per second to the keys/version, keys/query, and r0/sync endpoints. It's solved by closing and reopening the app, and I noticed it after resuming my computer from sleep. The requests are returning with a 200.
https://user-images.githubusercontent.com/4146037/212907838-faa6da41-2f50-47f7-a658-e8d9f3998997.mov
General note: I think the bug could come from https://github.com/matrix-org/matrix-js-sdk, but for me it is easier to report it from element-web (I don't have as many logs from matrix-js-sdk, but I also see a lot of requests from the Synapse point of view)
Steps to reproduce
The docker-compose deployment looks like this:

```yaml
version: '3.7'

services:
  # src https://matrix.org/docs/guides/understanding-synapse-hosting#setting-up-a-database
  matrix-db:
    container_name: matrix-db
    # src https://hub.docker.com/r/matrixdotorg/synapse/tags
    image: docker.io/postgres:14-alpine
    restart: unless-stopped
    environment:
      - POSTGRES_USER=${MATRIX_DB_USER}
      - POSTGRES_PASSWORD=${MATRIX_DB_PASSWD}
      - POSTGRES_INITDB_ARGS=--encoding=UTF-8 --lc-collate=C --lc-ctype=C
    volumes:
      - matrix_db_data:/var/lib/postgresql/data

  # src https://matrix.org/docs/guides/understanding-synapse-hosting#setting-up-a-database
  matrix:
    container_name: matrix
    # src https://hub.docker.com/r/matrixdotorg/synapse/tags
    image: docker.io/matrixdotorg/synapse:v1.74.0
    restart: unless-stopped
    #entrypoint: sleep infinity
    volumes:
      - matrix_data:/data
    # src https://hub.docker.com/r/matrixdotorg/synapse
    environment:
      - SYNAPSE_CONFIG_PATH=/data/homeserver.yaml
    # src https://github.com/matrix-org/synapse/issues/8304#issuecomment-989218044
    depends_on:
      - matrix-db
    ports:
      - 8008:8008

volumes:
  matrix_data:
  matrix_db_data:
```

`.env` file
configured postgres this way:

```sh
gen_config() {
  docker_path='/var/lib/docker'
  prefix="$(basename $(pwd))"
  volname="${prefix}_matrix_data"
  # generate config
  docker run -it --rm --mount type=volume,src=${volname},dst=/data \
    -e SYNAPSE_SERVER_NAME=localhost -e SYNAPSE_REPORT_STATS=no \
    matrixdotorg/synapse:v1.74.0 generate
}

use_psql_config() {
  # substitute sqlite block (all database:) with postgresql suggested config
  start_block='database:'
  end_block='log_config:'
  block_raw="$(cat <
  # (the rest of this script was truncated in the original post)
```

run admin user
then avoid rate limiting the admin user:

```sh
homeserver='http://localhost:8008'
user='@admin:localhost'
user_name='admin'
user_passwd='admin'

# note: login_data/login_query must be set before get_access_token is called
login_data='{"type":"m.login.password", "user":"'"${user_name}"'", "password":"'"${user_passwd}"'"}'
login_query="${homeserver}/_matrix/client/r0/login"

get_access_token() {
  curl -s -XPOST -d "${login_data}" "${login_query}" | jq -r '.access_token'
}

access_token="$(get_access_token)"

avoid_rate_limiting() {
  override_data='{ "messages_per_second": 0, "burst_count": 0}'
  override_query="${homeserver}/_synapse/admin/v1/users/${user}/override_ratelimit"
  # the post already does a get somehow
  #curl -H "authorization: Bearer ${access_token}" "${override_query}"
  curl -XPOST -H "authorization: Bearer ${access_token}" -d "${override_data}" "${override_query}"
}

avoid_rate_limiting
```

loaded app.element.io client for localhost
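To confirm the override actually took effect, the same admin endpoint can be read back with a GET. A self-contained sketch; `homeserver`, `user`, and `access_token` are placeholders matching the script above:

```shell
# Sketch: read back the ratelimit override to confirm it was applied.
# All values are placeholders for your own deployment.
homeserver='http://localhost:8008'
user='@admin:localhost'
access_token='REPLACE_ME'

override_query="${homeserver}/_synapse/admin/v1/users/${user}/override_ratelimit"

check_ratelimit_override() {
  # GET on the override endpoint returns the current override, e.g.
  # {"messages_per_second": 0, "burst_count": 0}
  curl -s -H "Authorization: Bearer ${access_token}" "${override_query}"
}

# check_ratelimit_override   # uncomment to run against a live homeserver
```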
Outcome
What did you expect?
The system idle, waiting for requests.
What happened instead?
A DDoS occurs: the client sends far too many requests to the homeserver, on an API that according to the spec should be rate limited (but here is not), so system performance degrades quickly as the user adds more and more devices joining Matrix and sending fake data. Because some bug is triggered, the system becomes inoperable at ~30 MB of data (the size of the PostgreSQL volume) with ~20 devices of user admin and ~30 rooms.
At some point after a page refresh, this triggers A LOT of requests and makes my machine very slow.
I have a 5.7 MiB screen recording that I can share with you; I cannot upload it here on GitHub, but here is a captured image:
the "version" in red refers to this query (which is retried a lot):

`GET /_matrix/client/v3/room_keys/version`
The Matrix spec (https://spec.matrix.org/latest/client-server-api/#get_matrixclientv3room_keysversion) says that this API endpoint is rate limited; I configured this Matrix instance to be non-rate-limited (see how I did it above, in the steps to reproduce) because this account is for a bot, and I don't want a rate limit on posting messages.

And the response, as shown by the captured image:
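As a side note, that endpoint can be probed manually to see what a single, well-behaved call looks like. A minimal sketch; `homeserver` and `access_token` are placeholders:

```shell
# Sketch: query the key backup version once, by hand.
# A healthy client should hit this rarely, not in a tight loop.
# homeserver and access_token are placeholders for your own deployment.
homeserver='http://localhost:8008'
access_token='REPLACE_ME'

backup_version_url="${homeserver}/_matrix/client/v3/room_keys/version"

get_backup_version() {
  # Returns the current backup info, or an M_NOT_FOUND error body
  # if no key backup exists for the account.
  curl -s -H "Authorization: Bearer ${access_token}" "${backup_version_url}"
}

# get_backup_version   # uncomment to run against a live homeserver
```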
And here, in the `docker-compose` output for Synapse, you can see the requests; it is putting my machine at 100% usage. The machine running this experiment is old, but it is processing a lot of queries per second; this is insane!

I looked for a solution and found these issues, but they are not about the same problem:
according to the web console, this is when the DDoS starts:
Operating system
Debian GNU/Linux 11 (bullseye)
Browser information
Version 108.0.5359.124 (Official Build) built on Debian 11.5, running on Debian 11.6 (64-bit)
URL for webapp
https://app.element.io
Application version
Element version: 1.11.17
Olm version: 3.2.12
Homeserver
http://localhost:8008
Will you send logs?
Yes