matrix-org / pantalaimon

E2EE aware proxy daemon for matrix clients.
Apache License 2.0
290 stars, 41 forks

/send/m.room.message times out in some encrypted rooms #75

Open ghost opened 3 years ago

ghost commented 3 years ago

Describe the bug: Pantalaimon times out when sending messages in specific encrypted rooms.

To Reproduce: No idea.

Expected behavior: Pantalaimon does its job.

Screenshots: Error from matrix-bot-sdk:

MatrixLiteClient (REQ-38150) Error: ESOCKETTIMEDOUT
    at ClientRequest.<anonymous> (/home/user/user/node_modules/request/request.js:816:19)
    at Object.onceWrapper (events.js:420:28)
    at ClientRequest.emit (events.js:314:20)
    at Socket.emitRequestTimeout (_http_client.js:769:9)
    at Object.onceWrapper (events.js:420:28)
    at Socket.emit (events.js:326:22)
    at Socket._onTimeout (net.js:482:8)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7) {
  code: 'ESOCKETTIMEDOUT',
  connect: false
}

Additional context: This only happens in one room, which has 301 users.

[Bot]
# omitted
SSL = True
Notifications = False
ListenAddress = localhost
ListenPort = 8080
IgnoreVerification = True
# had to disable because otherwise pantalaimon crashes when it tries to get stdin input as a system service
UseKeyring = False

ghost commented 3 years ago

no reply in three days

this is why people don't use matrix

poljar commented 3 years ago

Sharing E2EE keys with 300 users, each of whom might have multiple devices, will take some time. Your bot seems to give up before the keys are shared.

ghost commented 3 years ago

Encryption in that specific room worked fine for weeks, so I doubt it's that. I suspect the bot's session is being blocked by a user, but I have no way of checking since it's a headless setup. Installing the dbus library required for panctl prevents pantalaimon from starting.

poljar commented 3 years ago

E2EE keys may expire and will then need to be re-shared. People deleting or adding devices also expires E2EE keys, and the number of devices might have grown, which would explain why sharing the keys now takes longer. It might be something else, but the first thing to try is raising the timeout for your bot.
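To make the device-count point concrete: the key-sharing work grows with the total number of devices in the room, not with the number of users. A minimal sketch, assuming a response shaped like the one returned by the Matrix `/keys/query` endpoint (a `device_keys` map from user ID to that user's devices). The helper names and the sample user and device IDs are made up for illustration.

```python
from typing import Any, Dict


def count_devices(keys_query_response: Dict[str, Any]) -> Dict[str, int]:
    """Return the number of devices per user in a /keys/query-shaped response."""
    device_keys = keys_query_response.get("device_keys", {})
    return {user_id: len(devices) for user_id, devices in device_keys.items()}


def total_devices(keys_query_response: Dict[str, Any]) -> int:
    """Total number of devices that would each need a copy of the room key."""
    return sum(count_devices(keys_query_response).values())


# Hypothetical sample data: two users, three devices in total.
sample_response = {
    "device_keys": {
        "@alice:example.org": {"DEVICEA1": {}, "DEVICEA2": {}},
        "@bob:example.org": {"DEVICEB1": {}},
    }
}

per_user = count_devices(sample_response)
print(per_user, total_devices(sample_response))
```

Comparing this total over time would show whether the room's device count has grown enough to explain slower key sharing.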

ghost commented 3 years ago

I can't do that. matrix-bot-sdk provides no way to increase the timeout for sending events.

poljar commented 3 years ago

Can you perhaps confirm that this is indeed the issue by using curl to send a message?
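The same check can be scripted so that the client-side timeout is explicit and the time taken by the proxy is measured. This is a sketch in Python rather than curl, assuming Pantalaimon answers plain HTTP on the ListenAddress and ListenPort from the config above (localhost:8080) and that the `r0` send endpoint of the Matrix client-server API is used. The function names, room ID, access token, and the 300-second timeout are placeholders, not values from this report.

```python
import json
import time
import urllib.parse
import urllib.request
import uuid


def build_send_request(base_url, room_id, access_token, text, txn_id=None):
    """Build a PUT request that sends an m.room.message event through the proxy."""
    txn_id = txn_id or uuid.uuid4().hex  # a unique transaction ID per message
    path = "/_matrix/client/r0/rooms/{}/send/m.room.message/{}".format(
        urllib.parse.quote(room_id, safe=""),
        urllib.parse.quote(txn_id, safe=""),
    )
    body = json.dumps({"msgtype": "m.text", "body": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        method="PUT",
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


def timed_send(request, timeout_seconds=300.0):
    """Send the request with a generous timeout; return (reply, elapsed seconds)."""
    started = time.monotonic()
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        reply = json.loads(response.read().decode("utf-8"))
    return reply, time.monotonic() - started
```

Calling `timed_send(build_send_request("http://localhost:8080", room_id, token, "test"))` with a real room ID and token would show whether the message eventually goes through and how long it took. A success after a long wait would point at key sharing rather than a blocked session.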

ghost commented 3 years ago

There isn't going to be any difference between matrix-bot-sdk sending a message to the room (MatrixClient.ts#987) and curl sending a message to the room. The issue isn't just /send/${type}/${track} timing out either; this bug freezes up pantalaimon entirely for a few seconds.

Yoric commented 3 years ago

I encountered the same issue, briefly, in a room with exactly 2 users, me and my bot.

  1. Yesterday, my bot worked nicely.
  2. Today, I connected with an element-web build that I built and deployed myself. My bot started failing with ESOCKETTIMEDOUT whenever I attempted to send a message.
  3. I closed the tab with this unusual element-web then reopened it.
  4. The bot resumed working correctly.