janelleshane opened this issue 4 years ago
I really like 'victorian' - the absurdity mixed with patches of older-style language works well. It reminds me of reading Victorian-era novels where I've lost track of the rambling text but can still pick out comical descriptions of things.
In November 2017 I crowdsourced a dataset of over 10,000 first lines of novels, and trained syll-rnn to generate new examples. It struggled to remain coherent over more than a few words at a time, but it did produce some highlights:
Highlights and lowlights: (screenshot examples omitted)
For NaNoGenMo 2019, I decided to revisit this dataset with a larger, more powerful neural net called GPT-2. Unlike most of the neural nets that came earlier, GPT-2 can write entire essays with readable sentences that stay mostly on topic (even if it has a tendency to lose its train of thought or get very weird). I fine-tuned the largest model that was easily trainable via GPT-2-simple, the 355M size of GPT-2.
For a writeup of these results: https://aiweirdness.com/post/189170306297/how-to-begin-a-novel
For the original training dataset, plus four different output texts, each of which passes the 50,000-word minimum: https://github.com/janelleshane/novel-first-lines-dataset
Highlights and lowlights: (screenshot examples omitted)
For previews of the 4 raw output files, see below.
An explanation: Although the sentences are independent in my training data, GPT-2 is used to large blocks of text that go together. The result is that if I prompt it with, say, a line from Harry Potter fanfic, the neural net will tend to stay in that vein for a while. The raw output files therefore have different flavors. I chose a temperature setting of 0.8, and used the default truncation setting of 0.
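For context on what the temperature setting does: temperature divides the model's logits before the softmax, so values below 1.0 sharpen the distribution (more conservative, repetitive text) and values above 1.0 flatten it (weirder text). A minimal sketch of the idea, not GPT-2-simple's actual code; the function name and toy logits here are my own:

```python
import numpy as np

def sample_with_temperature(logits, temperature=0.8, rng=None):
    """Sample a token index from logits after temperature scaling.

    Lower temperature concentrates probability on the top tokens;
    higher temperature spreads it out over unlikely ones.
    """
    rng = rng or np.random.default_rng(0)
    scaled = np.asarray(logits, dtype=float) / temperature
    # Softmax, subtracting the max for numerical stability.
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Toy 3-token vocabulary: token 0 is strongly favored by the model.
logits = [2.0, 1.0, 0.1]
idx = sample_with_temperature(logits, temperature=0.8)
```

At a very low temperature this collapses toward always picking the most likely token, which is why low-temperature neural net text tends to loop.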
ancient
prompt: "It is a terrible, terrible idea, even if entirely accidental, to talk to one of the Ancient Ones."
Example raw output:
ponies
prompt: "Twilight Sparkle was out of cupcakes."
Example raw output:
potter
prompt: "Harry glared at Snape, vigorously stirring the bowl of frosting."
Example raw output:
victorian
prompt: "It is a truth universally acknowledged"
Example raw output: