Open jiko opened 11 years ago
Awesome! Could you add these projects to the Resources issue, #11? We're compiling useful resources over there.
Sure! Thanks for the encouragement. I have to clean up my code some and then I'll post to the Resources issue.
So far I've focused on research and finding useful tools for the job. I've named my bot Gen Austen and gathered Jane Austen's six novels for my corpus. I got my working title, "Distraction and Distractibility", from an NPR story about a collaboration between an English professor and neuroscientists to study the effect of close reading on the brain. Thus far I've worked with the Node modules 'markov' and 'pos' but have not generated anything substantial.
OK. I have ten 50k+ word novels in the output directory of my NaNoGenMo fork. I used all six of Jane Austen's novels, @leonardr's In-Dialogue, @ianrenton's NaNoGenMo, advanced anagramming, the example markov.go, and my own JavaScript, Python, and Bash. The last two novels use fanfic from FanFiction.net and Archive of Our Own. README.md gives a brief explanation of each.
Thank you @dariusk for putting this together and @leonardr and @ianrenton for the code. Also, thank you @leonardr for BeautifulSoup. I used it for my first paid programming gig back in 2007. The Alice in Wonderland references in the documentation helped soothe my nerves.
Hooray! Thanks for participating!
I've previously done Markov-chain text-generation for Twitter bots, using some Python code found on GitHub.
Here's a list of my relevant GitHub projects, in chronological order. They all build on one another.
I would really like to use a Markov approach combined with some natural language processing to make grammatically correct sentences. Maybe that is too ambitious. Grabbing random sentences or paragraphs from related texts seems like a more doable approach.
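For anyone following along, here is a rough sketch of the plain Markov part of that idea, without any of the natural language processing. It is not the code from my projects or from the 'markov' module; the function names `build_chain` and `generate` and the `order`, `length`, and `seed` parameters are made up for illustration, and it assumes the corpus has more words than `order`.

```python
import random
from collections import defaultdict


def build_chain(words, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def generate(chain, length=50, seed=None):
    """Walk the chain and return a string of exactly `length` words."""
    rng = random.Random(seed)
    states = sorted(chain)  # sorted so a fixed seed gives a repeatable walk
    state = rng.choice(states)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (this state only occurs at the end of the corpus):
            # jump to a fresh random state and keep going.
            state = rng.choice(states)
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out[:length])
```

Calling `generate(build_chain(text.split()), length=50000)` on a whole novel would give a NaNoGenMo-sized output. Nothing here checks grammar; a higher `order` tends to read more smoothly but copies longer runs straight from the source.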