matrix-org / matrix-ircd

An IRCd implementation backed by Matrix.
Apache License 2.0

Frequent disconnects, maybe SSL related, often right after start #35

Open madduck opened 7 years ago

madduck commented 7 years ago

I've been trying now for hours (well, irssi has been…) to connect to matrix-ircd, but it doesn't seem to want to succeed.

Usually, it connects, displays the MOTD, then starts to sync all the channels/rooms (becomes unresponsive during this time), until eventually irssi says:

Irssi: warning SSL write error: Broken pipe

and disconnects.

At other times I've seen:

Irssi: warning SSL read error: server closed connection unexpectedly
Irssi: warning SSL read error: server closed connection unexpectedly
Irssi: warning SSL read error: server closed connection unexpectedly
Irssi: warning SSL read error: server closed connection unexpectedly
Irssi: warning SSL read error: server closed connection unexpectedly

There isn't really anything relevant in the debug output from matrix-ircd, except maybe:

WARN Unhandled IO error, error: Received 500 response
INFO Finished

but that I see quite often.

It's not always like this, but I need to try 10–20 times for the IRC connection to actually come alive. And then it's still possible that it dies in one of those ways out of the blue.

Will provide more information as I get it, or please ask…

madduck commented 7 years ago

With 79487fe415a4287a139aa44ca1ae1387584e824d, I've just had it try 240 times before I finally gave up. Not a single successful connection attempt.

Correction: this is more related to the upgrade to Debian stretch than the new HEAD. The new HEAD on Debian jessie also sometimes closes the connection unexpectedly (which is what was running when I filed this issue). Since the upgrade to stretch though, connections are basically no longer possible, independently of #39.

madduck commented 7 years ago

Some more information: Irssi's rawlog shows nothing useful. No 500 response appears anywhere in it (that might have come from the homeserver), and there is nothing about the SSL read error either; the connection just drops in the middle of text being exchanged. There is a lot of traffic before the disconnect, though, as all the autojoined rooms get synced. It's almost as if SSL just can't handle it and falls over. Maybe this is in the Rust implementation?

madduck commented 7 years ago

This continues to be an issue. Any ideas/plans to fix this?

madduck commented 5 years ago

This has been ongoing for months, if not years. Usually, I can fix it with /rmreconns and then an explicit /connect, which yields a stable connection most of the time (not always). This connection then often stays up for weeks!

… until one day it goes down, and we re-enter the cycle.

Every time, matrix-ircd successfully authenticates with the homeserver, and a new device is created (cf. #50/#43). What then follows is the sync operation, which is known to take a long time, and which is then followed by minutes of IRC being updated with respect to channel memberships and modes.

E.g. the Irssi raw log will be filled with stuff like this:

Mar 04 05:36:49 << MODE #matrix-ircd:matrix.org
Mar 04 05:36:49 >> :localhost 324 madduck #matrix-ircd:matrix.org +n
Mar 04 05:36:49 --> chanquery mode
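
To see how much traffic piles up before a drop, the rawlog can be tallied per direction and per second. This is only a minimal Python sketch: it assumes every rawlog line has the shape of the three lines above (timestamp, then `<<`, `>>` or `-->`, then the payload). The name `summarize_rawlog` and the regex are illustrative and not part of Irssi or matrix-ircd.

```python
import re
from collections import Counter

# Assumed line shape, inferred only from the sample lines above:
# "<Mon> <DD> <HH:MM:SS> <direction marker> <payload>"
LINE_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) (<<|>>|-->) (.*)$")


def summarize_rawlog(lines):
    """Count rawlog entries per direction marker and per timestamp (1 s resolution)."""
    per_direction = Counter()
    per_second = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that does not fit the assumed shape
        timestamp, direction, _payload = match.groups()
        per_direction[direction] += 1
        per_second[timestamp] += 1
    return per_direction, per_second


sample = [
    "Mar 04 05:36:49 << MODE #matrix-ircd:matrix.org",
    "Mar 04 05:36:49 >> :localhost 324 madduck #matrix-ircd:matrix.org +n",
    "Mar 04 05:36:49 --> chanquery mode",
]

directions, seconds = summarize_rawlog(sample)
print(seconds.most_common(1))  # [('Mar 04 05:36:49', 3)]
```

Run against a full rawlog (for example `summarize_rawlog(open(path))`, with `path` pointing at wherever the log was saved), the busiest seconds just before the last entry would show whether the disconnect coincides with a burst of channel mode queries.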

(I started this comment with the intention to debug this to the core, and now, OF COURSE, it hasn't disconnected in 12 hours. ♥ computers. I will update when the problem comes back, but now I must go offline…)

madduck commented 5 years ago

I am starting to think that this might be related to the number of rooms. Other users of my matrix-ircd instance don't seem to have problems as severe as mine, and I have joined many, many more rooms than they have.