Closed zouppen closed 1 year ago
Some digging through the code reveals that this is actually an issue in node-irc. It seems someone already opened a pull request at matrix-org/node-irc#30.
This seems to have been fixed along the way, probably by the node-irc changes. I'll add a test case for it in case it resurfaces, unless you can still reproduce it with some other messages?
Sounds like it's fixed?
If the message contains multi-byte UTF-8 characters, such as umlauted characters (öä), the long-message splitting works incorrectly.
IRC server message limits are set in bytes, but you are probably splitting by characters, which causes lines that are too long to be sent over the wire.
Proposed fix: First encode the message to UTF-8 and then split the resulting bytes. When splitting, take care not to split in the middle of a character (do not split if the next line would start with a continuation byte, i.e. a byte with the bit pattern 10xxxxxx). Try to split between words if possible. See the UTF-8 specification.
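To make the proposal concrete, here is a minimal sketch in Python. It is not node-irc's actual code: the function name `split_utf8` and the `max_bytes` parameter are hypothetical, and the caller is assumed to have already worked out how many payload bytes fit on one IRC line.

```python
from typing import List


def split_utf8(text: str, max_bytes: int) -> List[str]:
    """Split text into chunks whose UTF-8 encoding is at most max_bytes long.

    Hypothetical helper for illustration only. It prefers to break at a
    space and never breaks inside a multi-byte character.
    """
    data = text.encode("utf-8")
    chunks = []
    start = 0
    while len(data) - start > max_bytes:
        end = start + max_bytes
        # The next chunk would begin at data[end]. Back up while that byte
        # is a continuation byte (bit pattern 10xxxxxx).
        while end > start and (data[end] & 0xC0) == 0x80:
            end -= 1
        if end == start:
            raise ValueError("max_bytes is too small to hold one character")
        # Prefer the last space at or before the cut. A space is a
        # single-byte ASCII character, so cutting there is always safe.
        space = data.rfind(b" ", start, end + 1)
        if space > start:
            chunks.append(data[start:space].decode("utf-8"))
            start = space + 1  # drop the space we split on
        else:
            chunks.append(data[start:end].decode("utf-8"))
            start = end
    if start < len(data):
        chunks.append(data[start:].decode("utf-8"))
    return chunks
```

For example, `split_utf8("hyvää päivää kaikille", 10)` breaks at the spaces, and every resulting chunk encodes to at most 10 bytes even though "ä" takes two bytes each.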
To reproduce the bug, use the following Matrix message:
It results in the following IRC lines:
If you look carefully, "s tietää paljon" is missing from the first part, and "todella iso. naisille sisustus on tärkeää" from the second message.