RITlug / teleirc

Go implementation of a Telegram <=> IRC bridge for use with any IRC channel and Telegram group
https://docs.teleirc.com/
GNU General Public License v3.0

Escape IRC messages as plaintext (i.e. no Markdown formatting) in Telegram #142

Closed jwflory closed 5 years ago

jwflory commented 5 years ago

Summary

IRC messages sent to Telegram should always render as plaintext (i.e. no Markdown formatting rules applied)

Background

When hyperlinks containing underscore characters are sent across the bridge, Telegram has suddenly started rendering the text between the underscores as italics:

https://meetbot.fedoraproject.org/fedora-meeting-1/2019-04-25/f30-readiness-meeting.2019-04-25-19.00.log.html

[Screenshot: rendered message in Telegram when the URL is sent across TeleIRC]

Obviously, URLs should render as plaintext only.

Details

This needs investigation. We need to do some pre-processing on received IRC messages before sending them over to Telegram. I think this is an unintended side effect of #134, when we changed parse_mode:

https://github.com/RITlug/teleirc/blob/b677658f32296e629e81867dfb814b4f4d308695/lib/TeleIrc.js#L335

Outcome

URLs successfully appear as written when sent across bridge

Tjzabel commented 5 years ago

Discussion

I'm always hesitant to add more pre-message processing as this adds more overhead to TeleIRC before messages are sent across the bridge.

At the moment, it seems to me that we have two courses of action with this issue:

  1. Check to see if a message contains http or https in it, and turn off markdown within that section if so
  2. Revert the markdown entirely

Option 1

Course 1 involves checking every IRC message for a contained string (does JS have this natively?) and reverting only the web link back to plaintext. This is most likely possible, but would nonetheless add some overhead to each message. If this is something we are willing to do, then I don't have any issues going forward with it.
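For reference, a minimal sketch of what Option 1 could look like. JS does have a native substring check (`String.prototype.includes`), and a regular expression can pick out the link itself. The names `URL_PATTERN`, `containsLink`, and `escapeLinks` are hypothetical and not part of TeleIRC. The sketch also assumes that backslash-escaping `_`, `*`, `` ` ``, and `[` is enough to stop Telegram's legacy Markdown parse mode from formatting the link.

```javascript
// Hypothetical sketch of Option 1; none of these names exist in TeleIRC.
// Matches http:// or https:// followed by a run of non-whitespace characters.
const URL_PATTERN = /https?:\/\/\S+/g;

// Cheap native check so messages without links skip the regex entirely.
function containsLink(message) {
  return message.includes("http://") || message.includes("https://");
}

// Backslash-escape Markdown control characters, but only inside links.
// Assumption: Telegram's legacy Markdown mode treats _ * ` [ as special.
function escapeLinks(message) {
  if (!containsLink(message)) {
    return message;
  }
  return message.replace(URL_PATTERN, (url) => url.replace(/[_*`\[]/g, "\\$&"));
}
```

With this approach, text outside of links is left untouched, so any Markdown there would still be interpreted by Telegram.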

Option 2

Course 2 simply reverts all the markdown, getting rid of the IRC bolding. I've gotten quite fond of the IRC bolding on the Telegram side, as it makes it easy to see when folks are messaging from IRC. Going this route, we would need to weigh the pros and cons of having correct links versus having the bold formatting. This is the easiest option.

Tjzabel commented 5 years ago

I created #143 since links are so important. This would go towards Option 2 as a short-term action, while we work on planning out how Option 1 would work.

xforever1313 commented 5 years ago

The line that introduces the markdown to bold the username is inside of IrcMessageHandler.js:

let message = "<*" + username + "*> " + userMessage;

We're only using markdown to bold the username, and nowhere else (to my knowledge). Could we just escape the user's message so any markdown gets escaped? Something like:

let message = "<*" + username + "*> " + Markdown.Escape( userMessage );

Given Node's lackluster standard library, I doubt such a function exists without pulling in a dependency or rolling our own.
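Rolling our own could be fairly small. Below is a rough sketch, not a tested patch: `escapeMarkdown` and `formatIrcMessage` are hypothetical names, and the character set assumes Telegram's legacy Markdown parse mode, where `_`, `*`, `` ` ``, and `[` start formatting and can be escaped with a preceding backslash.

```javascript
// Hypothetical helper; not part of TeleIRC or of Node's standard library.
// Assumption: in Telegram's legacy Markdown mode, only _ * ` [ need escaping.
function escapeMarkdown(text) {
  // "$&" re-inserts the matched character after the literal backslash.
  return text.replace(/[_*`\[]/g, "\\$&");
}

// Same message format as the line quoted above: the username stays bold,
// and only the user's message is escaped.
function formatIrcMessage(username, userMessage) {
  return "<*" + username + "*> " + escapeMarkdown(userMessage);
}
```

For example, `formatIrcMessage("nick", "see some_log_file.html")` yields `<*nick*> see some\_log\_file.html`, so the underscores should no longer be read as italics. The username is deliberately left unescaped here; a nick containing `*` or `_` would need separate handling.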

Tjzabel commented 5 years ago

Hm, that could very much be a good solution. Thanks for looking into this!

jwflory commented 5 years ago

+1 for @xforever1313's solution. We don't need to parse the message content as Markdown, only the username.