mispy-archive / twitter_ebooks

Better twitterbots for all your friends~
MIT License

How Does This AI Work? #112

Closed: amadeus closed this issue 8 years ago

amadeus commented 8 years ago

So, I was playing around with this on a joke account with a friend who's a developer/engineer, and it generated two @ replies that were pretty bad and seemingly not at all relevant to the user's tweet history. I've shut down the bot for now, but just as an FYI, the tweets it generated were:

"Ahh youth, when is it a crime to not forcibly assault someone"

"While I'm not transphobic, though you'd have to saw through and cut yourself on"

Any insight into how/why it may have come up with these?

negatendo commented 8 years ago

There is no AI here (at least by any modern definition, and not counting the things others have done with their bots' behaviors). The statement generator works on a much simpler principle: the Markov chain. This is the same process that's been used to generate spam emails since the '90s, for example! https://en.wikipedia.org/wiki/Markov_chain

So it's a statistical model, not an "intelligent" one, and has no idea what the words actually MEAN in your tweets. It simply uses your tweets as a corpus to find statistical patterns in them and produce phrases that follow that pattern. This results in different words and phrases from various tweets being stitched together into something that "looks" right.
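To make the "stitching" concrete, here is a minimal sketch of a word-level Markov chain in Ruby. It is not the code twitter_ebooks actually uses; the names `build_chain` and `generate` and the three sample tweets are made up for illustration, and the real model is likely more elaborate.

```ruby
# Illustrative bigram Markov chain; NOT the twitter_ebooks implementation.

# Map each word to the list of words that followed it somewhere in the corpus.
def build_chain(tweets)
  chain = Hash.new { |hash, key| hash[key] = [] }
  tweets.each do |tweet|
    tweet.split.each_cons(2) { |current, following| chain[current] << following }
  end
  chain
end

# Start from a word and keep picking a random recorded successor
# until there is none left or max_words is reached.
def generate(chain, start, max_words: 12, rng: Random.new)
  words = [start]
  while words.length < max_words
    followers = chain.fetch(words.last, [])
    break if followers.empty?
    words << followers.sample(random: rng)
  end
  words.join(" ")
end

# Hypothetical corpus standing in for a tweet archive.
tweets = [
  "I am not a fan of mornings",
  "I am a big fan of coffee",
  "coffee is not a crime"
]

chain = build_chain(tweets)
puts generate(chain, "I")
```

Because "am" was followed by both "not" and "a" in the sample corpus, a run can wander from one tweet into another at any shared word, which is how a sentence nobody wrote gets assembled from pieces somebody did write.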

An example I like to use is a slot machine. Imagine that each "reel" in the slot machine contains parts of your tweets. After you spin the reels and they stop (chunk chunk chunk), you will see different parts of your past tweets assembled into one big tweet. Magnetic poetry is another good metaphor.

So no, your bot is not going to say good things all the time. In fact, the statistical chances of your bot saying terrible things are very high, because "good" and "bad" and "am" and "not" pattern together so nicely. The fun comes from the occasional serendipity and timing of the bot, and perhaps some self-parody, but not any real artificial intelligence. I don't recommend twitter_ebooks bots for people who take their twitter seriously.

That's my understanding at least. Happy to hear from others.

rachelhyman commented 8 years ago

If you're interested in playing with the library more and would like to make your bot a better citizen, I would suggest reading this post by Darius Kazemi about how he used code to detect possibly-transphobic jokes and keep his bot from tweeting them.

amadeus commented 8 years ago

@negatendo thanks for the write-up! And yes, it was a super lazy thread title; I did know it was not a true AI. The interesting thing is that I grepped through the tweet history and he's never used a term like "transphobic", so I am wondering if perhaps @ replies get put together in a different way? It must have at least some library of filler-like words that it adds. I would hope it wouldn't use the N-word, for example. Perhaps I can blacklist or remove words from a general dictionary?

@rachelhyman thanks for that link, I'll check it out!

ghost commented 8 years ago

The model only recombines existing sets of tokens; it cannot output novel words that aren't in the corpus. Not sure what's going on there.
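One way to test that claim against a particular bot is to list the words in a generated tweet that never occur in the corpus the model was supposedly built from. The sketch below is a standalone diagnostic, not part of twitter_ebooks; `normalize`, `vocabulary`, and `words_missing_from_corpus` are hypothetical helper names, and the sample corpus is made up.

```ruby
require "set"

# Lowercase a token and strip leading/trailing punctuation so that
# "crime!" and "crime" compare equal.
def normalize(token)
  token.downcase.gsub(/\A[^a-z0-9@#']+|[^a-z0-9@#']+\z/, "")
end

# Set of every normalized word that appears anywhere in the corpus.
def vocabulary(corpus)
  corpus.flat_map(&:split).map { |t| normalize(t) }.reject(&:empty?).to_set
end

# Words in the generated text that the corpus never contains.
def words_missing_from_corpus(generated, corpus)
  vocab = vocabulary(corpus)
  generated.split
           .map { |t| normalize(t) }
           .reject(&:empty?)
           .uniq
           .reject { |word| vocab.include?(word) }
end

# Hypothetical corpus standing in for the archive you believe was consumed.
corpus = ["I am not a fan of mornings.", "Coffee is not a crime!"]
missing = words_missing_from_corpus("While I'm not transphobic, coffee is a crime", corpus)
puts missing.inspect
```

If the list comes back non-empty for a tweet the bot really produced, the likeliest explanation is that the model was built from a different archive than the one being checked.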

negatendo commented 8 years ago

@amadeus Yeah, as @mispy said, there is no other text in there aside from what comes from your tweets (and replies). Is it possible you archived 0xabad1dea's tweets, as per the example command?

Good news though: there is a stopwords.txt file you can use to exclude any words you don't want from the text modeling system.
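Separately from whatever stopwords.txt does inside the library, a generic safeguard is to check each candidate tweet against your own blocklist before posting it. The following is only a sketch under that assumption: `BLOCKED_WORDS`, `blocked?`, and `safe_statement` are hypothetical names, the word list is a placeholder, and the block passed to `safe_statement` stands in for whatever call produces a candidate statement in your bot.

```ruby
require "set"

# Placeholder list; in practice this would hold the words you never want tweeted.
BLOCKED_WORDS = Set.new(%w[assault transphobic]).freeze

# True if any whole word in the text is on the blocklist (case-insensitive).
# Whole-word matching means inflected forms such as "assaulted" need their own entries.
def blocked?(text, blocked = BLOCKED_WORDS)
  text.downcase.scan(/[a-z0-9']+/).any? { |word| blocked.include?(word) }
end

# Ask the given block for candidates until one passes the filter.
# Returns nil if every attempt is rejected, so the caller can skip tweeting.
def safe_statement(max_tries: 10)
  max_tries.times do
    candidate = yield
    return candidate unless blocked?(candidate)
  end
  nil
end

# Usage with a canned list standing in for the real generator.
candidates = ["not to forcibly assault someone", "coffee is not a crime"].each
puts safe_statement { candidates.next }
```

Returning `nil` instead of a fallback string keeps the decision with the caller: a bot that stays silent is usually preferable to one that posts a canned apology every time the filter trips.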

rachelhyman commented 8 years ago

It looks like both the phrases "transphobic" and "forcibly assault" are in 0xabad1dea's tweet archive, so I think you're on the money, @negatendo.

amadeus commented 8 years ago

I'm nearly 100% positive I consumed the proper file. I just ran through the steps listed in the example readme: https://github.com/mispy/ebooks_example/blob/master/README.md

amadeus commented 8 years ago

Finally had time to look back into this and figured out the issue: I had to change some other parts of bots.rb to source the proper model file. Perhaps the README in the example should reflect this? Not a big deal though. Thanks again, everyone!

Just re-read the README and realized it does mention these things; I just hadn't noticed. Carry on.