element-hq / element-web

A glossy Matrix collaboration client for the web.
https://element.io
GNU Affero General Public License v3.0

element-web client DDoSing my localhost matrix homeserver #24117

Closed · pedro-nonfree closed this issue 1 year ago

pedro-nonfree commented 1 year ago

General note: I think the bug could come from https://github.com/matrix-org/matrix-js-sdk, but for me it is easier to report it from element-web (I don't have many logs from matrix-js-sdk, but I also see a lot of requests from the synapse point of view).

Steps to reproduce

the docker-compose deployment looks like this:

```yaml
version: '3.7'

services:
  # src https://matrix.org/docs/guides/understanding-synapse-hosting#setting-up-a-database
  matrix-db:
    container_name: matrix-db
    # src https://hub.docker.com/r/matrixdotorg/synapse/tags
    image: docker.io/postgres:14-alpine
    restart: unless-stopped
    environment:
      - POSTGRES_USER=${MATRIX_DB_USER}
      - POSTGRES_PASSWORD=${MATRIX_DB_PASSWD}
      - POSTGRES_INITDB_ARGS=--encoding=UTF-8 --lc-collate=C --lc-ctype=C
    volumes:
      - matrix_db_data:/var/lib/postgresql/data

  # src https://matrix.org/docs/guides/understanding-synapse-hosting#setting-up-a-database
  matrix:
    container_name: matrix
    # src https://hub.docker.com/r/matrixdotorg/synapse/tags
    image: docker.io/matrixdotorg/synapse:v1.74.0
    restart: unless-stopped
    #entrypoint: sleep infinity
    volumes:
      - matrix_data:/data
    # src https://hub.docker.com/r/matrixdotorg/synapse
    environment:
      - SYNAPSE_CONFIG_PATH=/data/homeserver.yaml
    # src https://github.com/matrix-org/synapse/issues/8304#issuecomment-989218044
    depends_on:
      - matrix-db
    ports:
      - 8008:8008

volumes:
  matrix_data:
  matrix_db_data:
```

.env file

MATRIX_DB_USER='synapse'
MATRIX_DB_PASSWD='synapse'

I configured postgres this way:

```sh
gen_config() {
    docker_path='/var/lib/docker'
    prefix="$(basename $(pwd))"
    volname="${prefix}_matrix_data"
    # generate config
    docker run -it --rm --mount type=volume,src=${volname},dst=/data \
        -e SYNAPSE_SERVER_NAME=localhost -e SYNAPSE_REPORT_STATS=no \
        matrixdotorg/synapse:v1.74.0 generate
}

use_psql_config() {
    # substitute sqlite block (all database:) with postgresql suggested config
    start_block='database:'
    end_block='log_config:'
    block_raw="$(cat <
```
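
The heredoc above got cut off. As a hypothetical reconstruction (not taken from the original script), the block being spliced into homeserver.yaml is presumably the PostgreSQL config suggested by the Synapse docs, with credentials matching the .env file and matrix-db as the host:

```sh
# Hypothetical reconstruction of the truncated heredoc: the PostgreSQL database
# block suggested by the Synapse docs, with host/credentials assumed to match
# the docker-compose and .env files above.
block_raw="$(cat <<'EOF'
database:
  name: psycopg2
  args:
    user: synapse
    password: synapse
    database: synapse
    host: matrix-db
    cp_min: 5
    cp_max: 10
EOF
)"
```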

register the admin user:

register_new_matrix_user -u admin -p admin -c /data/homeserver.yaml -a

then disable rate limiting for the admin user:

```sh
homeserver='http://localhost:8008'
user='@admin:localhost'
user_name='admin'
user_passwd='admin'

# login_data and login_query must be set before get_access_token is called
login_data='{"type":"m.login.password", "user":"'"${user_name}"'", "password":"'"${user_passwd}"'"}'
login_query="${homeserver}/_matrix/client/r0/login"

get_access_token() {
    curl -s -XPOST -d "${login_data}" "${login_query}" | jq -r '.access_token'
}

access_token="$(get_access_token)"

avoid_rate_limiting() {
    override_data='{ "messages_per_second": 0, "burst_count": 0}'
    override_query="${homeserver}/_synapse/admin/v1/users/${user}/override_ratelimit"
    # the post already does a get somehow
    #curl -H "authorization: Bearer ${access_token}" "${override_query}"
    curl -XPOST -H "authorization: Bearer ${access_token}" -d "${override_data}" "${override_query}"
}

avoid_rate_limiting
```
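
The same Synapse admin endpoint also answers GET requests, so the override can be checked after it has been set; a quick verification, reusing the access_token from the script above, looks roughly like this:

```sh
# Verify the ratelimit override for the admin user
# (access_token comes from the login step above)
curl -s -H "authorization: Bearer ${access_token}" \
    "http://localhost:8008/_synapse/admin/v1/users/@admin:localhost/override_ratelimit"
# expected response: {"messages_per_second": 0, "burst_count": 0}
```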

loaded the app.element.io client pointed at the localhost homeserver

(screenshot)

Outcome

What did you expect?

The system stays idle, ready and waiting for requests.

What happened instead?

A DDoS occurs (the client sends too many requests to the homeserver on an API endpoint that, according to the spec, should be rate limited but is not), so system performance degrades quickly as that user adds more and more devices joining Matrix and sending fake data. Because some bug is triggered, the system becomes inoperable at ~30 MB of data (the size of the postgresql volume) with ~20 devices of user admin and ~30 rooms.

At some point after a page refresh, this triggers a huge number of requests and makes my machine very slow.

I have a 5.7 MiB screen recording that I can share with you; I cannot upload it here to GitHub, but here is a screen capture:

(screenshot)

the "version" entry highlighted in red refers to this query (which is retried over and over):

Request URL: http://localhost:8008/_matrix/client/v3/room_keys/version
Request Method: GET
Status Code: 404 Not Found
Remote Address: [::1]:8008
Referrer Policy: no-referrer

The related GET /_matrix/client/v3/room_keys/version endpoint in the Matrix spec (https://spec.matrix.org/latest/client-server-api/#get_matrixclientv3room_keysversion) says that this API endpoint is rate limited; however, I configured this Matrix instance not to rate limit this account (see how I did it above, in the steps to reproduce), because the account is for a bot and I don't want it to be rate limited when posting messages.

and the response, as shown in the screen capture:

{"errcode":"M_NOT_FOUND","error":"No backup found"}

And here, in the docker-compose output for synapse, you can see the requests; they put my machine at 100% usage. The machine running this experiment is old, but it is still processing a lot of queries per second; this is insane!

matrix     | 2022-12-29 00:24:13,087 - synapse.access.http.8008 - 460 - INFO - GET-7481 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.021sec/0.001sec (0.003sec, 0.001sec) (0.002sec/0.009sec/6) 279B 200 "GET /_matrix/client/r0/sync?filter=0&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" [0 dbevts]
matrix     | 2022-12-29 00:24:13,102 - synapse.access.http.8008 - 460 - INFO - GET-7482 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.019sec/0.001sec (0.004sec, 0.000sec) (0.004sec/0.008sec/6) 240B 200 "GET /_matrix/client/r0/sync?filter=1&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" [0 dbevts]
matrix     | 2022-12-29 00:24:13,132 - synapse.access.http.8008 - 460 - INFO - GET-7484 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.022sec/0.001sec (0.008sec, 0.000sec) (0.002sec/0.012sec/6) 240B 200 "GET /_matrix/client/r0/sync?filter=1&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" [0 dbevts]
matrix     | 2022-12-29 00:24:13,135 - synapse.access.http.8008 - 460 - INFO - GET-7485 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.018sec/0.001sec (0.007sec, 0.000sec) (0.002sec/0.008sec/6) 279B 200 "GET /_matrix/client/r0/sync?filter=0&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" [0 dbevts]
matrix     | 2022-12-29 00:24:13,160 - synapse.access.http.8008 - 460 - INFO - GET-7486 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.022sec/0.001sec (0.007sec, 0.000sec) (0.002sec/0.013sec/6) 240B 200 "GET /_matrix/client/r0/sync?filter=1&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" [0 dbevts]
matrix     | 2022-12-29 00:24:13,186 - synapse.access.http.8008 - 460 - INFO - GET-7487 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.021sec/0.001sec (0.007sec, 0.000sec) (0.002sec/0.011sec/6) 240B 200 "GET /_matrix/client/r0/sync?filter=1&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" [0 dbevts]
matrix     | 2022-12-29 00:24:13,211 - synapse.access.http.8008 - 460 - INFO - GET-7489 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.030sec/0.002sec (0.006sec, 0.001sec) (0.005sec/0.016sec/6) 279B 200 "GET /_matrix/client/r0/sync?filter=0&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" [0 dbevts]
matrix     | 2022-12-29 00:24:13,217 - synapse.access.http.8008 - 460 - INFO - GET-7490 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.025sec/0.001sec (0.006sec, 0.002sec) (0.003sec/0.012sec/6) 240B 200 "GET /_matrix/client/r0/sync?filter=1&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" [0 dbevts]
matrix     | 2022-12-29 00:24:13,237 - synapse.access.http.8008 - 460 - INFO - GET-7491 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.015sec/0.001sec (0.003sec, 0.001sec) (0.002sec/0.006sec/6) 240B 200 "GET /_matrix/client/r0/sync?filter=1&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" [0 dbevts]
matrix     | 2022-12-29 00:24:13,264 - synapse.access.http.8008 - 460 - INFO - GET-7492 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.022sec/0.001sec (0.007sec, 0.001sec) (0.003sec/0.010sec/6) 240B 200 "GET /_matrix/client/r0/sync?filter=1&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" [0 dbevts]
matrix     | 2022-12-29 00:24:13,265 - synapse.access.http.8008 - 460 - INFO - POST-7494 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.010sec/0.002sec (0.002sec, 0.000sec) (0.001sec/0.004sec/3) 1804B 200 "POST /_matrix/client/r0/keys/query HTTP/1.1" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" [0 dbevts]
matrix     | 2022-12-29 00:24:13,292 - synapse.access.http.8008 - 460 - INFO - GET-7495 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.019sec/0.004sec (0.009sec, 0.002sec) (0.002sec/0.007sec/6) 240B 200 "GET /_matrix/client/r0/sync?filter=1&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)" [0 dbevts]
matrix     | 2022-12-29 00:24:13,319 - synapse.access.http.8008 - 460 - INFO - GET-7497 - ::ffff:172.24.0.1 - 8008 - {@admin:localhost} Processed request: 0.028sec/0.001sec (0.003sec, 0.002sec) (0.004sec/0.013sec/6) 279B 200 "GET /_matrix/client/r0/sync?filter=0&timeout=30000&since=s6721_2122_0_1_100_1_1_29_0 HTTP/1.1" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" [0 dbevts]

I looked for a solution and found these issues, but they are not about the same problem:

according to the web console, this is when the DDoS starts:

(screenshot)

Operating system

Debian GNU/Linux 11 (bullseye)

Browser information

Version 108.0.5359.124 (Official Build) built on Debian 11.5, running on Debian 11.6 (64-bit)

URL for webapp

https://app.element.io

Application version

Element version: 1.11.17
Olm version: 3.2.12

Homeserver

http://localhost:8008

Will you send logs?

Yes

pedro-nonfree commented 1 year ago

OK, so this is how I solved it in my local testing deployment:

I had a lot of devices:

(screenshot)

(...)

(screenshot)

after signing out of all those devices, I no longer have this problem
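
(For anyone hitting the same thing: instead of clicking through the Element UI, the stale sessions can also be listed via the Synapse admin API; a rough sketch, reusing the admin access_token from the steps above:)

```sh
# List all devices/sessions of the admin user via the Synapse admin API
# (access_token is the admin token obtained in the steps to reproduce)
curl -s -H "authorization: Bearer ${access_token}" \
    "http://localhost:8008/_synapse/admin/v2/users/@admin:localhost/devices" \
    | jq -r '.devices[].device_id'
# stale sessions can then be removed with
# POST /_synapse/admin/v2/users/@admin:localhost/delete_devices
```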

the web console looks similar, but it no longer hits that spiral of requests

(screenshot)

So it looks like this problem is solved for me. And for the next person who reaches this issue, I hope it does not take nearly 2 hours of their time as it did for me.

I still think this number of queries should never happen.

weeman1337 commented 1 year ago

Thanks @pedro-nonfree. The number of requests looks like the initial sync. Depending on the number of rooms, messages and devices there may be quite a few requests.

Closing this now then.

pedro-nonfree commented 1 year ago

thanks @weeman1337, I am not very familiar with how the initial sync works, but to me it looks like a bug, especially the huge number of requests per second.

anyway, I am happy because I solved it, and I documented it here so that anyone facing the same problem can maybe find this

here is the screen capture from before I finished: to me, it looks like far too many requests and far too much data requested by a single user

(screenshot)

it was only ~32 rooms without much traffic in them (and the 22 devices)

according to docker, the postgresql database was 39 MB:

[13:40:22] $ docker volume inspect matrix-test_matrix_db_data
[
    {
        "CreatedAt": "2022-12-29T02:20:08+01:00",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "matrix-test",
            "com.docker.compose.version": "2.14.1",
            "com.docker.compose.volume": "matrix_db_data"
        },
        "Mountpoint": "/var/lib/docker/volumes/matrix-test_matrix_db_data/_data",
        "Name": "matrix-test_matrix_db_data",
        "Options": null,
        "Scope": "local"
    }
]
[13:40:39] $ sudo du -sh /var/lib/docker/volumes/matrix-test_matrix_db_data/_data
39M /var/lib/docker/volumes/matrix-test_matrix_db_data/_data

t3chguy commented 1 year ago

The number of requests looks like the initial sync. Depending on the number of rooms, messages and devices there may be some requests.

@weeman1337 an initial sync is a single request; it is the very first /sync and has no since parameter. Only the size of that sync response varies with the factors you suggested.

@pedro-nonfree

/sync is, as per the spec, meant to be called immediately after the last /sync returns. It is a long-polling request: whenever your server sends a response, it expects clients to open a new request as a channel for the next response, as an alternative to SSE.
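
A minimal sketch of that long-polling loop (illustrative only, not element-web's actual code): the very first call has no since parameter (that is the initial sync), and every subsequent call passes the next_batch token from the previous response and blocks for up to timeout milliseconds:

```sh
# Long-polling /sync loop as described above (illustrative sketch;
# access_token is assumed to be set from a prior login)
homeserver='http://localhost:8008'
since=''
while true; do
    resp="$(curl -s -H "authorization: Bearer ${access_token}" \
        "${homeserver}/_matrix/client/v3/sync?timeout=30000${since:+&since=${since}}")"
    # ...process the events in ${resp} here...
    since="$(printf '%s' "${resp}" | jq -r '.next_batch')"
done
```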

MagsMagnoli commented 1 year ago

@t3chguy I'm seeing an incredible number of requests per second to the keys/version, keys/query, and r0/sync endpoints. It's solved by closing and reopening the app, and I noticed it after resuming my computer from sleep. The requests are returning with a 200.

https://user-images.githubusercontent.com/4146037/212907838-faa6da41-2f50-47f7-a658-e8d9f3998997.mov