twentylemon opened 3 years ago
I would also like to see the idea of seeding the markov chain via the `!markov`
command, such as `!markov @human guys, today I learned that`
...
nltk has `ngrams`,
which we can use; we already have nltk as a dependency
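For reference, this lives at `nltk.util.ngrams`; a quick sketch of what it yields (the sample sentence is just an illustration):

```python
from nltk.util import ngrams

# Tokenize a message and pull its trigrams (n=3); nltk.util.ngrams
# yields one tuple per window of n consecutive tokens.
tokens = "today I learned that markov chains are fun".split()
trigrams = list(ngrams(tokens, 3))
# e.g. ('today', 'I', 'learned'), ('I', 'learned', 'that'), ...
```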
Also relevant reading: https://softwareengineering.stackexchange.com/questions/64288/storing-n-gram-data
https://github.com/Chippers255/channel_police/blob/main/extract.py is some code to fetch channel history, for the initial building of the markov transition matrices
See #701 for a preferable approach to markov chains
This has a lot of potential.
We have a database now. Let's use it.
For each user, we can break down messages as they come in to build their transition matrix over time, storing the matrix in the database. So, if a human says `I like food very much`, we could store the n-grams `I like food`, `like food very`, and `food very much`, or whatever `n` we want. Naturally, larger `n` builds better sentences, but our data is likely to be fewer words in general, plus we do have to store it all.

Then, with those transition tables, we could have a command to mimic a given user. We could also have additional arguments to `!markov`, like a starting point for the sentence to generate, or how many words to generate.
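A rough sketch of how that could fit together, using trigrams (2-token prefixes) and a plain dict for one user's table — the function names (`update_transitions`, `generate`) and the in-memory dict are just for illustration, not existing code; in practice the counts would be serialized to database rows like `(user, prefix, next_token, count)`:

```python
import random
from collections import defaultdict

def update_transitions(table, message, n=3):
    """Fold one incoming message into a user's transition table.

    Keys are (n-1)-token prefixes; values count each token seen
    immediately after that prefix.
    """
    tokens = message.split()
    # Slide a window of n tokens over the message.
    for gram in zip(*(tokens[i:] for i in range(n))):
        prefix, nxt = gram[:-1], gram[-1]
        table[prefix][nxt] += 1

def generate(table, seed=None, max_words=20):
    """Walk the table, weighting each step by observed counts.

    `seed` covers the `!markov @human guys, today I learned that`
    idea: the last two seed tokens pick the starting prefix when
    the table has seen them; otherwise we start from a random prefix.
    """
    out = seed.split() if seed else []
    prefix = tuple(out[-2:]) if len(out) >= 2 else None
    if prefix not in table:
        prefix = random.choice(list(table))
        out.extend(prefix)
    while len(out) < max_words and prefix in table:
        choices = table[prefix]
        nxt = random.choices(list(choices), weights=choices.values())[0]
        out.append(nxt)
        prefix = tuple(out[-2:])
    return " ".join(out)

# Build a tiny table from one message, then mimic with a seed.
table = defaultdict(lambda: defaultdict(int))
update_transitions(table, "I like food very much")
print(generate(table, seed="I like"))
```

With only one message stored, the walk is deterministic and just replays the sentence; real tables would branch wherever a prefix has been followed by several different words.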