ryanwinchester / tmi.ex

Twitch Messaging Interface for Elixir.
Apache License 2.0

Remove the special [U+e0000] "whitespace" character used by the Chatterino chat client. #11

Closed treuks closed 1 year ago

treuks commented 2 years ago

Hello! I have recently discovered your library, and it looks great, good job on that.

Unfortunately, I have noticed that your library doesn't account for the special [U+e0000] or \U000e0000 Unicode character.

This matters because Chatterino, a Twitch chat client, is used by many people in the communities I'm in. It has a feature that lets you post the same message multiple times by adding the special Unicode character mentioned above.

Because it counts as a different character, a bot that handles the raw message directly won't recognize the command, so the command never executes.

I could have created a PR, but I'm not that familiar with Elixir, and don't happen to know the best way to implement it.

This should be the character itself →

I kindly ask you to consider implementing this. If you have any questions, feel free to ask them right away.

Thank you for your great work!

ryanwinchester commented 2 years ago

I think you could do it yourself for now.

Adding to the README example:

defmodule MyBot do
  use TMI

  @impl TMI.Handler
  def handle_message("!" <> command, sender, chat) do
    case command do
      "dice" ->
        say(chat, Enum.random(~w(⚀ ⚁ ⚂ ⚃ ⚄ ⚅)))

      "echo " <> rest ->
        say(chat, rest)

      "dance" ->
        me(chat, "dances for #{sender}")

      _ ->
        say(chat, "unrecognized command")
    end
  end

  # Strip out the Chatterino special character (U+E0000) and call `handle_message/3` again.
  # `\u` takes exactly four hex digits, so codepoints above U+FFFF need the `\u{...}` form.
  def handle_message("\u{E0000}" <> command, sender, chat) do
    handle_message(command, sender, chat)
  end

  def handle_message(message, sender, chat) do
    Logger.debug("Message in #{chat} from #{sender}: #{message}")
  end
end

If I misunderstood what the problem is let me know.

treuks commented 2 years ago

I think you didn't entirely understand me, so I'll provide you with a better reference.

If you send a message the first time, the tags should be like this.

https://paste.ivr.fi/raw/kegequfuro

If you send it a second time, then they're like this.

https://paste.ivr.fi/raw/guxawugewi

If you look carefully at the tags, you'll see that it adds a small Unicode character in the middle of the message. So it would be good if you could show me a nice, Elixir-like way to strip the character (or characters, potentially) with little to no overhead.
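
A minimal sketch of one way to strip the character wherever it appears, assuming the marker is U+E0000 as described above. The module name `ChatterinoStrip` and the function `strip/1` are hypothetical and not part of TMI; the only calls used are `String.replace/3` and `String.trim/1` from the standard library.

```elixir
defmodule ChatterinoStrip do
  # Hypothetical helper, not part of TMI.
  # U+E0000 is the invisible character discussed in this thread.
  @marker "\u{E0000}"

  # Removes every occurrence of the marker, then trims the whitespace
  # left over at either end of the message.
  def strip(message) when is_binary(message) do
    message
    |> String.replace(@marker, "")
    |> String.trim()
  end
end
```

Calling `ChatterinoStrip.strip/1` on the message before matching on it would let the existing command clauses stay unchanged. Note that the trim also drops any leading or trailing whitespace the sender typed deliberately.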

ryanwinchester commented 2 years ago

Can you see what this logs?

defmodule MyBot do
  use TMI

  @impl TMI.Handler
  def handle_message(message, _sender, _chat, tags) do
    Logger.debug("Message: #{inspect(message)}")
    Logger.debug("Tags:\n#{inspect(tags, pretty: true)}")
  end

  # Fallback clause: only reached when messages arrive without tags.
  def handle_message(_message, _sender, _chat) do
    Logger.warning("If you see this tags are not enabled")
  end
end
treuks commented 2 years ago

Oh it actually tells me that tags are not enabled

18:51:44.370 [warning] If you see this tags are not enabled

18:51:46.204 [warning] If you see this tags are not enabled

That's weird, though, because I have this in my secret file:

import Config

config :treuksbotv2,
  bots: [
    [
      bot: Treuksbotv2,
      user: "treuks",
      pass: "oauth:redacted",
      channels: ["treuks"],
      capabilities: ["tags", "membership", "commands"],
      debug: false
    ]
  ]
ryanwinchester commented 2 years ago

Just for fun, try using charlists: ['membership', 'tags', 'commands'] (single quotes instead of double).

Also, what do the logs show when you connect and the IRC client requests capabilities?

treuks commented 2 years ago

Uhh, I have this, in case it's useful to you:


18:51:31.804 [debug] [Elixir.Treuksbotv2] UNRECOGNIZED: {:unrecognized, "421", %ExIRC.Message{args: ["treuksbot", "TAGS", "Unknown command"], cmd: "421", ctcp: false, host: [], nick: [], server: "tmi.twitch.tv", user: []}}

18:51:31.987 [debug] [Elixir.Treuksbotv2] UNRECOGNIZED: {:unrecognized, "421", %ExIRC.Message{args: ["treuksbot", "MEMBERSHIP", "Unknown command"], cmd: "421", ctcp: false, host: [], nick: [], server: "tmi.twitch.tv", user: []}}

18:51:31.987 [debug] [Elixir.Treuksbotv2] UNRECOGNIZED: {:unrecognized, "421", %ExIRC.Message{args: ["treuksbot", "COMMANDS", "Unknown command"], cmd: "421", ctcp: false, host: [], nick: [], server: "tmi.twitch.tv", user: []}}
treuks commented 2 years ago

Oh, actually, it seems like using single quotes actually made it work
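
For reference, the configuration posted earlier with the capability names written as charlists might look like the sketch below. It assumes only the change described in this thread (single quotes instead of double) and leaves every other value as posted. In Elixir, `'tags'` is a charlist (a list of integers) while `"tags"` is a binary, so the two are different values; `~c"tags"` is an equivalent spelling of the same charlist.

```elixir
import Config

config :treuksbotv2,
  bots: [
    [
      bot: Treuksbotv2,
      user: "treuks",
      pass: "oauth:redacted",
      channels: ["treuks"],
      # Charlists (single quotes) instead of binaries (double quotes),
      # as suggested above.
      capabilities: ['tags', 'membership', 'commands'],
      debug: false
    ]
  ]
```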

treuks commented 2 years ago

Here you go

19:22:28.014 [debug] Message: "test"

19:22:28.014 [debug] Tags:
%{
  "@badge-info" => "",
  "badges" => "",
  "color" => "#9ACD32",
  "display-name" => "danks2555",
  "emotes" => "",
  "first-msg" => "0",
  "flags" => "",
  "id" => "d179a3ee-6858-47d9-b054-41dbb8cdab0c",
  "mod" => "0",
  "room-id" => "212793593",
  "subscriber" => "0",
  "tmi-sent-ts" => "1650817347829",
  "turbo" => "0",
  "user-id" => "707918309",
  "user-type" => ""
}

19:22:29.846 [debug] Message: "test 󠀀"

19:22:29.846 [debug] Tags:
%{
  "@badge-info" => "",
  "badges" => "",
  "color" => "#9ACD32",
  "display-name" => "danks2555",
  "emotes" => "",
  "first-msg" => "0",
  "flags" => "",
  "id" => "c8f4c0f5-3b1e-43c9-a214-8db827ff38e2",
  "mod" => "0",
  "room-id" => "212793593",
  "subscriber" => "0",
  "tmi-sent-ts" => "1650817349661",
  "turbo" => "0",
  "user-id" => "707918309",
  "user-type" => ""
}
ryanwinchester commented 2 years ago

It looks like it's adding your character at the end of the message?

So, for now it looks like you could use this:

String.replace_suffix(message, <<0xF3, 0xA0, 0x80, 0x80>>, "")

(UTF-8 hexadecimal from https://charbase.com/e0000-unicode-invalid-character)
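
The byte sequence can also be checked from Elixir itself rather than taken from a lookup site. The sketch below uses the `::utf8` binary modifier to encode a codepoint; the module name `MarkerBytes` and the function `utf8_bytes/1` are hypothetical names chosen for illustration.

```elixir
defmodule MarkerBytes do
  # Hypothetical helper: returns the UTF-8 bytes of a codepoint as a list
  # of integers, using the `::utf8` binary modifier to do the encoding.
  def utf8_bytes(codepoint) when is_integer(codepoint) do
    :binary.bin_to_list(<<codepoint::utf8>>)
  end
end

# U+E0000 encodes to the four bytes used in the suffix above.
MarkerBytes.utf8_bytes(0xE0000)
# => [243, 160, 128, 128], that is [0xF3, 0xA0, 0x80, 0x80]

# The raw bytes and the string escape are the same binary.
<<0xF3, 0xA0, 0x80, 0x80>> == "\u{E0000}"
# => true
```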

ryanwinchester commented 2 years ago

Actually, it looks like you have a space and that special character at the end "test 󠀀"

32 (0x20) is the codepoint for space.

So, you might want:

String.replace_suffix(message, <<0x20, 0xF3, 0xA0, 0x80, 0x80>>, "")
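
As a rough illustration of where this call could sit, the sketch below strips the suffix first and only then pattern-matches on the command prefix. It assumes the suffix is always a space followed by U+E0000, as in the logged message. `CommandParser`, `normalize/1`, and `parse/1` are hypothetical names, and the module deliberately does not depend on TMI so it can be tried in isolation.

```elixir
defmodule CommandParser do
  # Hypothetical helper, not part of TMI.
  # Space (0x20) followed by U+E0000; the same bytes as
  # <<0x20, 0xF3, 0xA0, 0x80, 0x80>>.
  @suffix " \u{E0000}"

  # Drops the suffix if present; otherwise returns the message unchanged.
  def normalize(message) when is_binary(message) do
    String.replace_suffix(message, @suffix, "")
  end

  # Normalizes first, so "!dice" and "!dice" plus the suffix are
  # classified the same way.
  def parse(message) do
    case normalize(message) do
      "!" <> command -> {:command, command}
      other -> {:chat, other}
    end
  end
end
```

In a bot, the same idea would mean calling `normalize/1` at the top of the message handler and dispatching on its result.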
ryanwinchester commented 1 year ago

Okay, @treuks

Did you manage to strip them?

treuks commented 1 year ago

yep