matrix-org / matrix-appservice-irc

Node.js IRC bridge for Matrix
Apache License 2.0

Matrix to IRC messages not getting through #1768

Closed stefanor closed 11 months ago

stefanor commented 11 months ago

Apologies for a bad bug report, I need some help to understand what's going on.

Since upgrading matrix.debian.social's OFTC bridge to 1.0.1 (from 0.35), we've clearly hit some bug: the bridge has not been stable. When we upgraded to 1.0.1, we also had to upgrade nodejs and switch our IRC connections to a different hostname to get SSL verification working again. So several changes happened together.

Today's issue:

After the last startup (with 1.0.1), the bridge initially worked for me, delivering messages to and from IRC. But a few days later, my bridge IRC client had dropped out of some IRC channels and was no longer getting any messages from the Matrix side. The bot wasn't responding in the admin room at all, either.

I restarted the bot. With 500 users, restarts are slow (a couple of hours for all the users to rejoin). That got me back in all my rooms, but my messages still weren't getting through.

If I POST via the debug API, messages get to IRC, as me. So clearly there is something broken between Synapse and the bot.
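For anyone repeating that test, here is a minimal sketch of building such a POST. The path layout (`/irc/<irc domain>/user/<Matrix user ID>` with a raw IRC command as the body), the port, the user ID, and the channel are all assumptions for illustration; check the bridge's debug API documentation and your own config before relying on them.

```python
import urllib.request


def build_debug_request(base_url: str, irc_domain: str, user_id: str,
                        irc_command: str) -> urllib.request.Request:
    """Build (but do not send) a POST carrying a raw IRC command.

    The URL layout is an assumption, not something confirmed in this thread.
    The user ID is left unencoded on purpose; whether the bridge expects it
    percent-encoded is also an assumption to verify.
    """
    url = f"{base_url.rstrip('/')}/irc/{irc_domain}/user/{user_id}"
    return urllib.request.Request(
        url, data=irc_command.encode("utf-8"), method="POST"
    )


# Hypothetical values: adjust host, port, IRC domain, user and channel.
req = build_debug_request(
    "http://localhost:11100",
    "irc.oftc.net",
    "@alice:example.org",
    "PRIVMSG #example :test from the debug API",
)
print(req.get_method(), req.full_url)

# To actually send it (only against your own bridge):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

If a command sent this way reaches IRC while normal Matrix messages do not, the IRC side of the bridge is healthy and the fault is on the homeserver-to-bridge path.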

Any hints on further debugging?

stefanor commented 11 months ago

OK, some more poking around took me to https://github.com/matrix-org/matrix-appservice-node/pull/63. I think our Synapse is too old for the new matrix-appservice-irc. I backed out that commit, and now messages are flowing.

Not sure how anything worked since the upgrade, without that. I'll keep monitoring...
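The thread does not say what the linked pull request changed, so the following is only an illustration of one kind of version mismatch that can break homeserver-to-appservice traffic. The Matrix application service API lets a homeserver present its `hs_token` in an `Authorization: Bearer` header, while older homeservers sent it as an `access_token` query parameter. A receiver that accepts only the header would reject every transaction from an older homeserver. The function name and inputs below are hypothetical, not the library's code.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit


def extract_hs_token(path: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the homeserver token from the header or, failing that, the query.

    Hypothetical helper: assumes `headers` uses the canonical key
    "Authorization" and `path` is the request target including any query.
    """
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    # Legacy form used by older homeservers: ?access_token=<hs_token>
    values = parse_qs(urlsplit(path).query).get("access_token")
    return values[0] if values else None


print(extract_hs_token("/transactions/1", {"Authorization": "Bearer secret"}))
print(extract_hs_token("/transactions/1?access_token=secret", {}))
print(extract_hs_token("/transactions/1", {}))
```

Accepting both forms is a compatibility choice; upgrading the homeserver is the cleaner fix where that is possible.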

Half-Shot commented 11 months ago

Erp, it should have warned you on startup that communication between HS<->AS isn't working :(

> With 500 users, restarts are slow (a couple of hours for all the users to rejoin). That got me back in all my rooms, but my messages still weren't getting through.

You might want to adjust the flood delay value in the config, that sounds quite extreme.
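As a rough back-of-the-envelope check on why those numbers look extreme, assuming clients reconnect strictly one after another with a fixed delay between them (an assumption about the scheduling, not a description of the bridge's actual logic):

```python
def implied_seconds_per_client(total_seconds: float, clients: int) -> float:
    """Average time spent per client if reconnections are fully serialized."""
    return total_seconds / clients


def estimated_restart_seconds(clients: int, delay_seconds: float) -> float:
    """Total restart time for a given per-client delay, same assumption."""
    return clients * delay_seconds


# "A couple of hours" for 500 users works out to about 14.4 s per user.
print(implied_seconds_per_client(2 * 3600, 500))

# With a hypothetical 0.5 s delay, the same 500 users would take ~250 s.
print(estimated_restart_seconds(500, 0.5))
```

The 0.5 s figure is only an illustrative value, not a recommended setting; the right delay depends on what the IRC network tolerates.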

stefanor commented 11 months ago

> You might want to adjust the flood delay value in the config, that sounds quite extreme.

Ah, is that the critical timer here? thanks!

stefanor commented 11 months ago

> Erp, it should have warned you on startup that communication between HS<->AS isn't working :(

Ah, yes, it did. Lost in a sea of errors, but it was there.

Homeserver cannot reach the bridge.
