NaNoGenMo / 2016

National Novel Generation Month, 2016 edition.
https://nanogenmo.github.io

Stopword-delimited phrase chaining #118

Open enkiv2 opened 7 years ago

enkiv2 commented 7 years ago

Riffing off an idea by @JKirchartz, I separated out sentences from the Lovecraft corpus, split sentences into stopword-delimited phrases, and used my phrase-chain code from October to rearrange the phrases into a new form.
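To make the splitting step concrete, here is a minimal Python sketch of cutting a sentence into stopword-delimited phrases. It is not the code used for the novel: the stopword list, the tokenizing regex, the `split_phrases` name, and the choice to drop the stopwords themselves rather than keep them attached to a phrase are all assumptions made for illustration.

```python
import re

# Tiny illustrative stopword list (an assumption, not the list used for the novel).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "that",
             "was", "it", "is", "had", "with", "on", "from"}

def split_phrases(sentence, stopwords=STOPWORDS):
    """Split a sentence into runs of consecutive non-stopwords.

    Stopwords act as delimiters and are dropped (an assumption)."""
    phrases, current = [], []
    for word in re.findall(r"[A-Za-z']+", sentence.lower()):
        if word in stopwords:
            # A stopword closes the phrase collected so far, if any.
            if current:
                phrases.append(current)
                current = []
        else:
            current.append(word)
    if current:
        phrases.append(current)
    return phrases

example = split_phrases("The thing that should not be was in the ancient city")
```

With this stopword list, `example` holds three phrases: `thing`, `should not be`, and `ancient city`.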

Phrase-chain chooses a word from each phrase as a representative based on a configurable heuristic (in this case, I chose the word from each phrase whose frequency is closest to the global average frequency for all words in the corpus), and then chains the phrases together Markov-style by these representatives. (In other words, we classify phrases by their representatives, and each phrase is followed by another phrase that had, in its original context, been followed by a phrase with that same representative, chosen at random but weighted by frequency.)
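The following Python sketch shows one way to read that description; it is not the phrasechain implementation linked in this issue. The function names are hypothetical, "global average frequency" is assumed to mean the mean count over distinct words, and the chaining rule is assumed to be "the next phrase is drawn from the phrases that originally followed some phrase with the same representative as the current one". Keeping duplicates in the successor lists is what makes the random choice frequency-weighted.

```python
import random
from collections import Counter, defaultdict

def pick_representative(phrase, freq, mean_freq):
    # Word whose corpus count is closest to the mean count;
    # ties go to the earliest word in the phrase.
    return min(phrase, key=lambda w: abs(freq[w] - mean_freq))

def build_chain(sentences):
    """sentences: list of sentences, each a list of phrases (lists of words)."""
    freq = Counter(w for s in sentences for p in s for w in p)
    mean_freq = sum(freq.values()) / len(freq)  # assumed meaning of "global average"
    successors = defaultdict(list)
    for s in sentences:
        for cur, nxt in zip(s, s[1:]):
            rep = pick_representative(cur, freq, mean_freq)
            # Duplicates are kept so common successors are drawn more often.
            successors[rep].append(tuple(nxt))
    return freq, mean_freq, successors

def generate(start, freq, mean_freq, successors, max_phrases=20, rng=random):
    out = [tuple(start)]
    while len(out) < max_phrases:
        rep = pick_representative(out[-1], freq, mean_freq)
        options = successors.get(rep)
        if not options:
            break  # no phrase ever followed this representative
        out.append(rng.choice(options))
    return out

# Toy corpus, already split into phrases.
corpus = [
    [["old", "house"], ["dark", "cellar"]],
    [["old", "tower"], ["strange", "light"]],
]
freq, mean_freq, successors = build_chain(corpus)
chain = generate(["old", "house"], freq, mean_freq, successors)
```

In this toy corpus "old" is more frequent than average, so "house" and "tower" become the representatives of the two opening phrases, which is why the two sentences do not mix here. A larger corpus, where many phrases share a representative, is what produces the recombination.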

Completed novel: https://github.com/enkiv2/misc/blob/master/nanogenmo-2016/lovecraft-stopword-phrasechain.md

Phrasechain description and implementation: https://github.com/enkiv2/misc/tree/master/phrasechain