matrix-org / matrix-appservice-irc

Node.js IRC bridge for Matrix
Apache License 2.0

Bridge may allow bridging channels with trailing whitespace in the name #1658

Open tadzik opened 1 year ago

tadzik commented 1 year ago

We found a couple of those recently, as rooms with seemingly broken bridging. They'd show up in the database as rooms with invisible trailing whitespace:

> select * from rooms where irc_channel like '#foo-bar%';

  origin   |            room_id             |  type   |   irc_domain    | irc_channel |    irc_json     |  matrix_json  
-----------+--------------------------------+---------+-----------------+-------------+-----------------+---------------
 provision | !snip:matrix.org | channel | irc.libera.chat | #foo-bar   | {"modes":["s"]} | {"extras":{}}
 provision | !snip:matrix.org | channel | irc.libera.chat | #foo-bar  +| {"modes":["s"]} | {"extras":{}}

Further querying showed that the second row has a single character at the end, but I was not able to determine what character that actually is, only that it matches a \s in psql's regexp_ functions.
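For anyone who runs into this again before the data is cleaned up, the following is a minimal sketch of how the trailing character could be identified once the value has been read out of the database as a string. The helper name `trailingWhitespaceCodePoints` is hypothetical and not part of the bridge; it relies only on standard string methods.

```typescript
// Hypothetical diagnostic helper, not part of matrix-appservice-irc.
// Returns the code points of whatever trailing whitespace a channel name carries,
// formatted as "U+XXXX", so an invisible character can be identified.
function trailingWhitespaceCodePoints(channel: string): string[] {
    // trimEnd() strips Unicode whitespace and line terminators from the end,
    // so the slice below is exactly the trailing run that was stripped.
    const trailing = channel.slice(channel.trimEnd().length);
    return Array.from(trailing, (ch) =>
        "U+" + ch.codePointAt(0)!.toString(16).toUpperCase().padStart(4, "0"),
    );
}
```

Calling it on a clean name such as `"#foo-bar"` returns an empty array, while a name ending in a line feed returns `["U+000A"]`.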

In some cases there were no duplicate rows, but the existing one had the mysterious trailing whitespace in it.

After deleting the broken duplicates (such as the second #foo-bar row above), I fixed up the remaining rows with update rooms set irc_channel = regexp_replace(irc_channel, '\s$', '') where irc_channel != regexp_replace(irc_channel, '\s$', '');.

Seems like somewhere along the way the bridge doesn't do enough validation on the irc channel it's bridging, and the trailing whitespace breaks some of its operations but not others. A check for whether irc_channel == trim(irc_channel) should be performed before we insert things into the database to prevent this from happening in the future.
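As a rough illustration of the proposed check, here is a minimal sketch of a guard that could run before a mapping is written. The function name `assertChannelIsTrimmed` is hypothetical, and where it would be called in the bridge is an assumption; this is not the bridge's actual API.

```typescript
// Hypothetical pre-insert guard, not the bridge's actual API.
// Rejects channel names that differ from their trimmed form.
function assertChannelIsTrimmed(ircChannel: string): void {
    // String.prototype.trim() removes Unicode whitespace and line terminators
    // (including U+00A0) from both ends, so any difference means stray whitespace.
    if (ircChannel !== ircChannel.trim()) {
        throw new Error(
            `Refusing to store IRC channel ${JSON.stringify(ircChannel)}: ` +
            "leading or trailing whitespace",
        );
    }
}
```

Rejecting the value rather than silently trimming it is a design choice in this sketch: it surfaces the bad input to whoever issued the command, instead of quietly bridging a different channel name than the one they typed.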

There's a theory that the mysterious trailing whitespace character was a non-breaking space (U+00A0), "the whitespace macOS creates when pressing the option (alt) key and the spacebar", but the DB was cleaned up before that could be verified. It would explain how these broken bridged rooms came to be, if someone accidentally included this character when typing bot commands.

progval commented 1 year ago

If the room ID is on :matrix.org while the room is on the Libera bridge, then these look like plumbed rooms rather than portal rooms, so they likely come from Scalar.