dariusk / NaNoGenMo

National Novel Generation Month. Because.

Great idea! #20

Open jiko opened 11 years ago

jiko commented 11 years ago

I've previously done Markov chain text generation for Twitter bots, using some Python code I found on GitHub.

Here's a list of my relevant GitHub projects, in chronological order. They all build on one another.

I would really like to use a Markov approach combined with some natural language processing to make grammatically correct sentences. Maybe that is too ambitious. Grabbing random sentences or paragraphs from related texts seems like a more doable approach.
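For illustration, here is a minimal sketch of the word-level Markov approach described above. It is not the code from the projects mentioned in this thread; the names `build_chain` and `generate`, the `order` parameter, and the dead-end restart rule are all assumptions made for the example. It does no grammatical checking, so the output is only locally plausible.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=50, seed=None):
    """Walk the chain from a random state and return `length` space-joined words."""
    rng = random.Random(seed)
    states = sorted(chain)  # sorted so a fixed seed gives repeatable output
    state = rng.choice(states)
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (assumed policy): jump to a fresh random state.
            state = rng.choice(states)
            output.extend(state)
            continue
        word = rng.choice(followers)
        output.append(word)
        state = state[1:] + (word,)
    return " ".join(output[:length])
```

Calling `generate(build_chain(corpus_text), length=50000)` on a large plain-text corpus would produce a novel-length stream of words. A higher `order` copies longer runs verbatim from the source, while a lower one drifts into nonsense sooner.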

dariusk commented 11 years ago

Awesome! Could you add these projects to the Resources issue, #11? We're compiling useful resources over there.

jiko commented 11 years ago

Sure! Thanks for the encouragement. I have to clean up my code some and then I'll post to the Resources issue.

jiko commented 11 years ago

So far I've focused on research and finding useful tools for the job. I've named my bot Gen Austen and gathered Jane Austen's six novels for my corpus. I got my working title, "Distraction and Distractibility", from an NPR story about a collaboration between an English professor and neuroscientists to study the effect of close reading on the brain. Thus far I've worked with the Node modules 'markov' and 'pos', but I haven't generated anything substantial yet.

jiko commented 10 years ago

OK. I have ten 50k+ word novels in the output directory of my NaNoGenMo fork. I used all six of Jane Austen's novels, @leonardr's In-Dialogue, @ianrenton's NaNoGenMo, advanced anagramming, the example markov.go, and my own JavaScript, Python, and Bash. The last two novels use fanfic from FanFiction.net and Archive of Our Own. README.md gives a brief explanation of each.

jiko commented 10 years ago

Thank you @dariusk for putting this together and @leonardr and @ianrenton for the code. Also, thank you @leonardr for BeautifulSoup. I used it for my first paid programming gig back in 2007. The Alice in Wonderland references in the documentation helped soothe my nerves.

dariusk commented 10 years ago

Hooray! Thanks for participating!