Closed amadeus closed 8 years ago
There is no AI here (at least by any modern definition, and not counting the things others have built on top of its behavior). The statement generator works on a much simpler principle: the Markov chain. This is the same process that's been used to generate spam emails since the '90s, for example! https://en.wikipedia.org/wiki/Markov_chain
So it's a statistical model, not an "intelligent" one, and has no idea what the words actually MEAN in your tweets. It simply uses your tweets as a corpus to find statistical patterns in them and produce phrases that follow that pattern. This results in different words and phrases from various tweets being stitched together into something that "looks" right.
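To make the "statistical stitching" concrete, here's a minimal word-level sketch of the idea. This is illustrative only — the method names are made up and this is not how twitter_ebooks is actually structured internally:

```ruby
# Minimal bigram Markov chain: record which word follows which in the
# corpus, then walk those transitions at random to produce text that
# "looks" right without any understanding of meaning.
def build_chain(corpus)
  chain = Hash.new { |h, k| h[k] = [] }
  corpus.each do |tweet|
    tweet.split.each_cons(2) { |a, b| chain[a] << b }
  end
  chain
end

def generate(chain, start, length = 10)
  out = [start]
  (length - 1).times do
    nexts = chain[out.last]
    break if nexts.empty?
    out << nexts.sample # random successor, weighted by observed frequency
  end
  out.join(" ")
end
```

Because successors are sampled from real tweets, fragments of unrelated tweets end up stitched together whenever they share a connecting word.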
An example I like to use is a slot machine. Imagine that each "reel" in the slot machine contains parts of your tweets. After you spin the reels and they stop (chunk chunk chunk), you see different parts of your past tweets assembled into one big tweet. Magnetic poetry is another good metaphor.
So no, your bot is not going to say good things all the time. In fact, the statistical chances of your bot saying terrible things are very high, because "good" and "bad" and "am" and "not" pattern together so nicely. The fun comes from the occasional serendipity and timing of the bot, and perhaps some self-parody, but not any real artificial intelligence. I don't recommend twitter_ebooks bots for people who take their twitter seriously.
That's my understanding at least. Happy to hear from others.
If you're interested in playing with the library more and would like to make your bot a better citizen, I would suggest reading this post by Darius Kazemi about how he used code to detect possibly-transphobic jokes and keep his bot from tweeting them.
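Kazemi's post is about checking generated output before it's tweeted. As a rough illustration of that output-side approach (the pattern list and helper name here are invented, not taken from his code):

```ruby
# Reject any generated tweet that matches a flagged pattern, so the
# bot can regenerate instead of posting it. Illustrative sketch only.
FLAGGED = [/transphobic/i, /forcibly assault/i]

def safe_to_tweet?(text)
  FLAGGED.none? { |pattern| text.match?(pattern) }
end
```

In a real bot you'd loop: generate a candidate, test it, and only post once `safe_to_tweet?` returns true (or give up after a few tries).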
@negatendo thanks for the write up! And yes, it was a super lazy thread title, I did know it was not a true AI. The thing that is interesting is that I grepped through the tweet history and he's never used a term like "transphobic", so I am wondering if perhaps @ replies get put together in a different way? It must have at least some library of "filler" words that it adds. I would hope it wouldn't use the N-word, for example. Perhaps I can blacklist or remove words from a general dictionary?
@rachelhyman thanks for that link, I'll check it out!
The model only recombines existing sets of tokens; it cannot output novel words that aren't in the corpus. Not sure what's going on there.
@amadeus Yeah, as @mispy said, there is no other text in there aside from what comes from your tweets (and replies). Is it possible you archived 0xabad1dea's tweets, as per the example command?
Good news though: there is a stopwords.txt file you can use to exclude any words you don't want from the text modeling system.
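The effect of a stopword list is input-side: listed words are dropped before the model ever sees them. A sketch of that idea (not the gem's actual code — the helper name is made up):

```ruby
# Drop any token on the stopword list before the text reaches the
# modeling step, so excluded words can never appear in output.
# Sketch of the idea only, not twitter_ebooks' implementation.
def strip_stopwords(tokens, stopwords)
  banned = stopwords.map(&:downcase)
  tokens.reject { |t| banned.include?(t.downcase) }
end
```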
It looks like both "transphobic" and "forcibly assault" are in 0xabad1dea's tweet archive, so I think you're on the money, @negatendo.
I'm nearly 100% positive I consumed the proper file. I just ran through the steps listed in the example readme: https://github.com/mispy/ebooks_example/blob/master/README.md
Finally had time to look back into this and figured out the issue: I had to change some other parts of bots.rb to source the proper model file. Perhaps the README in the example should reflect this? Not a big deal, thanks again everyone!
Just re-read the readme and realized it does mention these things, although I didn't realize it. Carry on.
So, I was playing around with this on a joke account with a friend who's a developer/engineer, and it generated two @ replies that were pretty bad and didn't seem contextually relevant to the user's tweet history at all. I've shut down the bot for now, but just as an FYI, the tweets it generated were:
"Ahh youth, when is it a crime to not forcibly assault someone"
"While I'm not transphobic, though you'd have to saw through and cut yourself on"
Any insight into how/why it may have come up with these?