matrix-org / matrix-appservice-irc

Node.js IRC bridge for Matrix
Apache License 2.0

localparts for virtual Matrix users contain capital letters #1780

Open kbleeke opened 9 months ago

kbleeke commented 9 months ago

The localpart of a Matrix user ID is supposed to contain no upper-case letters:

https://spec.matrix.org/v1.8/appendices/#user-identifiers

However, the bridge creates localparts with capital letters if the IRC nick contains capital letters:

https://github.com/matrix-org/matrix-appservice-irc/blob/develop/src/irc/IrcServer.ts#L535

Synapse does not seem to care and simply accepts these user IDs.

Conduit, however, expects only lower-case IDs, so bridging fails for IRC users with capital letters in their nicks.

See also https://gitlab.com/famedly/conduit/-/issues/388

diff --git a/src/irc/IrcServer.ts b/src/irc/IrcServer.ts
index 7bcbef57..c9cce65e 100644
--- a/src/irc/IrcServer.ts
+++ b/src/irc/IrcServer.ts
@@ -548,7 +548,7 @@ export class IrcServer {
         return renderTemplate(this.config.matrixClients.userTemplate, {
             server: this.domain,
             nick,
-        }).substring(1); // the first character is guaranteed by config schema to be '@'
+        }).substring(1).toLowerCase(); // the first character is guaranteed by config schema to be '@'
     }

     public claimsUserId(userId: string): boolean {
@@ -586,9 +586,7 @@ export class IrcServer {
     }

     public getUserIdFromNick(nick: string): string {
-        const template = this.config.matrixClients.userTemplate;
-        return template.replace(/\$NICK/g, nick).replace(/\$SERVER/g, this.domain) +
-            ":" + this.homeserverDomain;
+        return "@" + this.getUserLocalpart(nick) + ":" + this.homeserverDomain;
     }

     public getDisplayNameFromNick(nick: string): string {

Lowercasing all localparts seems to work for me, but I don't know what the consequences are for other homeservers or existing deployments. Maybe this should be a config option.
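As a rough sketch of what such a config option could look like: the `lowercaseLocalparts` flag, the `LocalpartOptions` interface and the `renderLocalpart` helper are all hypothetical names used for illustration and are not part of the bridge's actual config schema or code. Only the `$SERVER` / `$NICK` placeholders and the leading `@` come from the code shown in the diff.

```typescript
// Hypothetical sketch: none of these names exist in matrix-appservice-irc.
interface LocalpartOptions {
    userTemplate: string;         // e.g. "@$SERVER_$NICK"; must start with '@'
    lowercaseLocalparts: boolean; // hypothetical opt-in flag
}

function renderLocalpart(opts: LocalpartOptions, server: string, nick: string): string {
    const rendered = opts.userTemplate
        // Function replacers avoid special handling of '$' sequences inside nicks.
        .replace(/\$SERVER/g, () => server)
        .replace(/\$NICK/g, () => nick)
        .substring(1); // drop the leading '@'
    // Existing deployments keep their current IDs unless they opt in.
    return opts.lowercaseLocalparts ? rendered.toLowerCase() : rendered;
}

const template = "@$SERVER_$NICK";
console.log(renderLocalpart({ userTemplate: template, lowercaseLocalparts: false }, "irc.example.org", "Alice"));
// prints "irc.example.org_Alice"
console.log(renderLocalpart({ userTemplate: template, lowercaseLocalparts: true }, "irc.example.org", "Alice"));
// prints "irc.example.org_alice"
```

Defaulting the flag to off would leave existing virtual users untouched, at the cost of new deployments still having to opt in before they work on stricter homeservers.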

erAck commented 9 months ago

https://spec.matrix.org/v1.8/appendices/#historical-user-ids says

In order to handle these rooms successfully, clients and servers MUST accept user IDs with localparts from the expanded character set:

extended_user_id_char = %x21-39 / %x3B-7E ; all ASCII printing chars except :

so that looks rather like a Conduit shortcoming.
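To make the two grammars concrete, here is a small sketch comparing them. The historical pattern follows the `extended_user_id_char` rule quoted above. The strict pattern is my reading of the v1.8 grammar (lower-case letters, digits and a few punctuation characters) and should be treated as an assumption; the function names and sample localparts are made up for illustration.

```typescript
// extended_user_id_char = %x21-39 / %x3B-7E (all printable ASCII except ':')
const HISTORICAL_LOCALPART = /^[\x21-\x39\x3B-\x7E]+$/;

// Assumed strict character set for newly created user IDs (check against the spec).
const STRICT_LOCALPART = /^[a-z0-9._=\/+-]+$/;

function isHistoricalLocalpart(localpart: string): boolean {
    return HISTORICAL_LOCALPART.test(localpart);
}

function isStrictLocalpart(localpart: string): boolean {
    return STRICT_LOCALPART.test(localpart);
}

// A localpart containing a capital letter is acceptable under the historical
// grammar but not under the strict one.
console.log(isHistoricalLocalpart("irc_Alice")); // true
console.log(isStrictLocalpart("irc_Alice"));     // false
console.log(isStrictLocalpart("irc_alice"));     // true
```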

kbleeke commented 9 months ago

(Just to be clear, I'm not affiliated with Conduit.)

Neither my room nor the user IDs are "historical"; they were created yesterday.

If anything, the next section even explains what bridges such as this one are supposed to do.
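For reference, the mapping described in that part of the spec ("Mapping from other character sets", as I understand it) could be sketched roughly as follows: encode as UTF-8, downcase `A-Z` (escaping them with `_` when case must be preserved, and doubling a real `_`), then encode any remaining disallowed bytes, as well as `=`, as `=` followed by two hex digits. The `mapToLocalpart` name is hypothetical, and this sketch conservatively also hex-encodes `+`, which stays valid either way.

```typescript
// Sketch of the case-preserving variant of the mapping; not the bridge's code.
function mapToLocalpart(input: string): string {
    const bytes = new TextEncoder().encode(input); // step 1: UTF-8 bytes
    let out = "";
    for (let i = 0; i < bytes.length; i++) {
        const b = bytes[i];
        const ch = String.fromCharCode(b);
        if (b >= 0x41 && b <= 0x5a) {
            out += "_" + ch.toLowerCase(); // 'A' becomes "_a"
        } else if (ch === "_") {
            out += "__";                   // escape a real underscore
        } else if (/[a-z0-9.\-\/]/.test(ch)) {
            out += ch;                     // already allowed, pass through
        } else {
            // Everything else, including '=', becomes "=" plus two hex digits.
            out += "=" + (b < 16 ? "0" : "") + b.toString(16);
        }
    }
    return out;
}

console.log(mapToLocalpart("Alice")); // prints "_alice"
console.log(mapToLocalpart("#"));     // prints "=23"
console.log(mapToLocalpart("á"));     // prints "=c3=a1"
```

A bridge that does not need to tell apart IDs differing only by case could drop the `_` escaping and simply downcase, which is closer to what the patch above does.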

ufm commented 6 months ago

The same issue occurs with the Dendrite server. The problem is that during registration the server returns a user_id, and that is the ID that has to be used from then on. Moreover, according to https://spec.matrix.org/v1.8/appendices/#user-identifiers, a user_id may contain only lower-case letters (as of v1.8). This means that the bridge is in direct violation of v1.8 of the spec.
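A minimal sketch of the registration point, assuming only that the registration response carries a `user_id` field: the caller keeps the ID the server returned rather than the one it computed from the nick. The `RegisterResponse` interface and `userIdAfterRegistration` helper are hypothetical and do not reflect how the bridge is actually structured.

```typescript
// Hypothetical shape: only the user_id field of the response is modelled here.
interface RegisterResponse {
    user_id: string;
}

function userIdAfterRegistration(
    requestedLocalpart: string,
    homeserverDomain: string,
    res: RegisterResponse,
): string {
    const requested = "@" + requestedLocalpart + ":" + homeserverDomain;
    if (res.user_id !== requested) {
        // The homeserver normalised the ID (for example by lowercasing it).
        console.warn("Requested " + requested + " but server assigned " + res.user_id);
    }
    // Always continue with the ID the server actually assigned.
    return res.user_id;
}

const assigned = userIdAfterRegistration("irc_Alice", "example.org", { user_id: "@irc_alice:example.org" });
console.log(assigned); // prints "@irc_alice:example.org"
```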