zephyrtronium / robot

Robot is a Twitch bot that learns from people and responds to them with things that it has learned.
GNU General Public License v3.0

Robot

Robot is a bot for Twitch.TV IRC that learns from people and responds to them with things that it has learned.

Tools for broadcasters and mods

Robot has a number of features for managing activity level and knowledge. Generally speaking, the bot is designed to treat chat moderation actions as moderating its knowledge as well. This includes:

In addition to the above, Robot provides explicit moderation commands.

What data does Robot store?

Robot learns from most messages in the Twitch chats she's in while the stream is online. In order to ensure moderators are able to do their job, she stores additional metadata on those messages as well. Here is the complete list of information types that go into Robot's databases:

In the message metadata, the message sender is stored using a cryptographic hash of the sender's user ID, the channel it was sent to, and the fifteen-minute time period in which it was sent. Roughly speaking, if Robot has been learning from Bocchi, message metadata together with Markov chain tuples can answer questions like these:

On the other hand, it is infeasible or at least very expensive for Robot's data to answer questions like these:

If you want Robot not to record your messages for any reason, simply use the give me privacy command. You'll still be able to ask Robot for messages and use other commands. If you'd like the bot to learn from you again after going private, use the learn from me again command.
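To make the sender hashing described earlier in this section a bit more concrete, here is a minimal sketch. It is not Robot's actual code: the choice of SHA-256, the NUL-separated encoding, and the name `sender_hash` are all assumptions made for illustration. The only things taken from the description are the three inputs: user ID, channel, and fifteen-minute time period.

```python
import hashlib

PERIOD_SECONDS = 15 * 60  # the fifteen-minute window from the description


def sender_hash(user_id: str, channel: str, unix_time: float) -> str:
    """Hash a sender's ID together with the channel and the time window.

    Assumption: SHA-256 over NUL-separated fields. The real hash and
    encoding may differ.
    """
    period = int(unix_time // PERIOD_SECONDS)
    data = f"{user_id}\x00{channel}\x00{period}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


# Two messages in the same window share a value; a later window does not.
same_window = sender_hash("12345", "#bocchi", 0) == sender_hash("12345", "#bocchi", 899)
next_window = sender_hash("12345", "#bocchi", 0) == sender_hash("12345", "#bocchi", 900)
print(same_window, next_window)
```

In this sketch, messages from one sender in one channel can be grouped only within a single window, because the hash changes whenever the user, the channel, or the window changes.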

How Robot works

Robot uses the mathematical concept of Markov chains, extended in some interesting ways, to learn from chat and apply its knowledge. Here's an example.

Let's say Robot sees this chat message: Bocchi the Rock!. The first thing it will do is run some preliminary checks to make sure it's ok to learn from the message, e.g. no links, sender hasn't opted out, &c.

This particular message is fine. Robot's next step is to break it up into a list of tokens – basically words or stretches of non-letter characters followed by spaces. The tokens here are <beginning of message>, Bocchi, the, Rock, !, <end of message>. The "beginning of message" and "end of message" are invisible tokens that are always there, at least conceptually.
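A rough sketch of this tokenizing step, in Python. The regular expression is an assumption that approximates "words or stretches of non-letter characters"; Robot's real tokenizer may split differently, and the names `tokenize`, `BEGIN`, and `END` are made up for the example.

```python
import re

BEGIN = "<beginning of message>"
END = "<end of message>"

# Assumption: a token is either a run of word characters or a run of
# other non-space characters (punctuation and the like).
TOKEN_RE = re.compile(r"\w+|[^\w\s]+")


def tokenize(message: str) -> list[str]:
    """Split a message into tokens, wrapped in the invisible markers."""
    return [BEGIN, *TOKEN_RE.findall(message), END]


print(tokenize("Bocchi the Rock!"))
# ['<beginning of message>', 'Bocchi', 'the', 'Rock', '!', '<end of message>']
```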

For each token, Robot now learns that all of the ones before it, as a group, can be followed by the one after. The prefix is made lowercase for this to help improve variety later. That is to say:
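The learning step can be sketched like this. It is an illustration under assumptions, not Robot's storage format: the knowledge here is just an in-memory dictionary from lowercased prefixes to the tokens seen after them, and `learn` is a hypothetical name.

```python
from collections import defaultdict

BEGIN = "<beginning of message>"
END = "<end of message>"


def learn(chain, tokens):
    """For each token after BEGIN, record that the lowercased tokens
    before it, taken as a group, can be followed by that token."""
    for i in range(1, len(tokens)):
        prefix = tuple(t.lower() for t in tokens[:i])
        chain[prefix].append(tokens[i])


chain = defaultdict(list)
learn(chain, [BEGIN, "Bocchi", "the", "Rock", "!", END])
for prefix, followers in chain.items():
    print(" ".join(prefix), "->", followers)
# <beginning of message> -> ['Bocchi']
# <beginning of message> bocchi -> ['the']
# <beginning of message> bocchi the -> ['Rock']
# <beginning of message> bocchi the rock -> ['!']
# <beginning of message> bocchi the rock ! -> ['<end of message>']
```

Note that only the prefix is lowercased. The token being learned keeps its original case, so it can be reproduced as written.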

Learning the message is finished. But robots don't like learning things they'll never use.

When it's time for Robot to think of something to say, the bot does a random walk on everything it's learned. Starting with the invisible beginning-of-message token, the bot picks out everything it has learned can follow and picks one option at random.

Let's say it picks the word You. Robot records that the random walk went to You, then looks for everything that can follow <beginning of message> you (converted to lowercase, as during learning). It might pick SHOULD next, record it, and look from <beginning of message> you should, where it maybe chooses HAVE, and after that waited.
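Here is a minimal sketch of that random walk, without any of the extensions. The dictionary layout and the name `generate` are assumptions for illustration, and the learned data is made up so that every step has exactly one option.

```python
import random

BEGIN = "<beginning of message>"
END = "<end of message>"


def generate(chain, rng=random):
    """Walk from BEGIN, picking a random follower at each step, until
    END is picked or nothing is known to follow the current context."""
    walk = []
    context = (BEGIN,)
    while True:
        options = chain.get(context)
        if not options:
            break
        token = rng.choice(options)
        if token == END:
            break
        walk.append(token)                # keep the original case for output
        context += (token.lower(),)       # search with the lowercased form
    return walk


# Hypothetical knowledge with a single path through it.
chain = {
    (BEGIN,): ["You"],
    (BEGIN, "you"): ["SHOULD"],
    (BEGIN, "you", "should"): ["HAVE"],
    (BEGIN, "you", "should", "have"): ["waited"],
    (BEGIN, "you", "should", "have", "waited"): [END],
}
print(" ".join(generate(chain)))  # You SHOULD HAVE waited
```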

To make generated messages more interesting, Robot can also shorten the context it's using to search when there are few options. Let's say after waited it starts applying this technique. Instead of looking for <beginning of message> you should have waited, it drops the beginning token and tries again for messages that contained "you should have waited" anywhere, rather than only at the start. This might still not help much, so it does it again, and we'll say once more, so that now it has also dropped you and should and is looking for just have waited.

Now it finds so as the next token. Along with adding it to the random walk, it restores one of the tokens it dropped from the random walk, in case that will match something else it's learned. So the next search happens with should have waited so. Next it picks long, followed by ! and <end of message>. So, the generated message is You SHOULD HAVE waited so long!.
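The context-shortening idea can be sketched as follows. This is a simplified model under assumptions: knowledge is a dictionary keyed by full lowercased prefixes, the search is a slow linear scan kept for clarity, and the `min_options` threshold and the names `followers` and `shorten` are invented for the example. How Robot actually decides when to shorten is not covered here.

```python
BEGIN = "<beginning of message>"
END = "<end of message>"


def followers(chain, context):
    """Tokens learned to follow `context` anywhere in a message, found
    by matching `context` against the end of every learned prefix."""
    n = len(context)
    return [tok for prefix, nxt in chain.items()
            if prefix[-n:] == context for tok in nxt]


def shorten(chain, context, min_options=2):
    """Drop leading tokens until at least `min_options` followers turn up
    or one token is left. Returns (tokens dropped, followers found)."""
    dropped = 0
    options = followers(chain, context)
    while len(options) < min_options and len(context) - dropped > 1:
        dropped += 1
        options = followers(chain, context[dropped:])
    return dropped, options


# Hypothetical knowledge from two learned messages.
chain = {
    (BEGIN, "you", "should", "have", "waited"): [END],
    (BEGIN, "i", "have", "waited"): ["so"],
    (BEGIN, "i", "have", "waited", "so"): ["long"],
}
context = (BEGIN, "you", "should", "have", "waited")
dropped, options = shorten(chain, context)
print(context[dropped:], options)
# ('have', 'waited') ['<end of message>', 'so']

# After choosing "so", restore one dropped token for the next search.
next_context = context[max(dropped - 1, 0):] + ("so",)
print(next_context)
# ('should', 'have', 'waited', 'so')
```

With the full context there is only one option, ending the message. After dropping three tokens the search also finds so, which is how a walk that started from one message can continue with material from another.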

Commands

Robot understands commands to be messages which start or end with the bot's username, ignoring case, possibly preceded by an @ character and possibly followed by punctuation when at the start. For example, if the bot's username is "Robot", then it will recognize these as commands:

These are not recognized as commands:
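The recognition rule can be approximated with a pair of regular expressions. This sketch is a guess at the details: which characters count as punctuation, and the requirement that the username stand alone as a word, are assumptions, and `is_command` is a hypothetical name. The sample messages are made up as well.

```python
import re


def is_command(message: str, username: str) -> bool:
    """Return True if the message starts or ends with the bot's username."""
    name = re.escape(username)
    # At the start: optional @, the name, optional punctuation (assumed
    # set), then whitespace or the end of the message.
    at_start = re.compile(rf"^@?{name}[,:;.!?]*(\s|$)", re.IGNORECASE)
    # At the end: start of message or whitespace, optional @, the name.
    at_end = re.compile(rf"(^|\s)@?{name}$", re.IGNORECASE)
    text = message.strip()
    return bool(at_start.search(text) or at_end.search(text))


for msg in ["@Robot how are you?", "robot, hello", "what's up robot",
            "I love robots", "robotics is fun"]:
    print(repr(msg), is_command(msg, "Robot"))
```

Under these assumptions the first three messages count as commands and the last two do not.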

Commands for everyone

Commands for moderators

Effects

Robot sometimes applies special effects to copypasta and randomly generated messages. Effects can be configured per channel. The possible effects are: