Hexameter Epos Generator: HEXEGEN

eseyffarth commented 9 years ago

I'm in. I'd like to generate an epos in hexameter verse, either from a Gutenberg book (or several Gutenberg books) or just from a bag of words. Gonna use CMUdict for this, whoohoo! I wonder if any/enough sentences matching a hexameter pattern occur in natural, human-written, English, non-hexameter literature? Guess I'll find out.

MichaelPaulukonis commented 9 years ago

Keep us posted!

eseyffarth commented 9 years ago

Here you go:

imageries wilderness carignan westerner gaudio mayo claudius amasa calendar lickety costumers teter peacefully venema univar standerford arborville wendell devenny pettersen cardigan rosellen optical's westrum...

Next steps are:

cross word boundaries when matching the syllable stress, so that a foot no longer has to consist of exactly one word
insert punctuation marks?
when the foot/word problem is solved, match patterns against a large existing corpus, e.g. from Gutenberg
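
The first of these steps could look roughly like the sketch below: join the stress digits of all words into one string and match the meter against that string, so that word breaks no longer matter. The tiny lexicon, the names `TOY_STRESS`, `stress_string` and `is_hexameter`, and the exact pattern are assumptions for illustration. The regex actually used in HEXEGEN is not shown in this thread.

```python
import re

# Hypothetical stand-in for CMUdict: one digit per syllable,
# 1 = stressed, 0 = unstressed (secondary stress folded into 1).
# The values are illustrative, not copied from CMUdict.
TOY_STRESS = {
    "sing": "1", "of": "0", "the": "0",
    "wilderness": "100", "calendar": "100", "peacefully": "100",
    "doomsday": "11", "bootleg": "11",
}

# Assumed accentual hexameter: four feet that are each a dactyl (100)
# or a spondee (11), then a dactyl, then a spondee or a trochee.
HEXAMETER = re.compile(r"(?:100|11){4}100(?:11|10)")


def stress_string(words, lexicon=TOY_STRESS):
    """Concatenate per-word stress digits, dropping word boundaries."""
    return "".join(lexicon[word] for word in words)


def is_hexameter(words, lexicon=TOY_STRESS):
    """True if the whole line scans, wherever the word breaks fall."""
    return HEXAMETER.fullmatch(stress_string(words, lexicon)) is not None


line = ["sing", "of", "the", "wilderness", "doomsday",
        "calendar", "peacefully", "bootleg"]
print(stress_string(line), is_hexameter(line))
```

Here the first dactyl is spread over the three words "sing of the", which a one-word-per-foot matcher could not produce.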

Here's my repo: https://github.com/ojahnn/NaPoGenMo15-HEXEGEN

enkiv2 commented 9 years ago

My haiku generator uses a regex to naively count syllables (assuming that vowels surrounded on all sides by either whitespace or consonants constitute syllables). If you do that, you can count out lines with between twelve and seventeen syllables; however, I'm not sure how you'd distinguish short from long syllables. Wikipedia claims a short syllable has one short vowel sound (no diphthong) and no coda, so I suppose you could isolate them separately and stick to vowels that are typically short in English for short syllables. My guess is that you'll come up with a handful of 'matches' in any large text, if only because these mechanisms for counting syllables and determining their length are fallible.
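
A minimal sketch of that kind of counter, assuming each run of vowel letters (with "y" counted as a vowel) is one syllable. The names `count_syllables` and `hexameter_length_lines` are made up here, and this is not the actual haiku generator code.

```python
import re

# Assumption: every maximal run of vowel letters is one syllable.
VOWEL_RUN = re.compile(r"[aeiouy]+")


def count_syllables(text):
    """Naive syllable count: the number of vowel-letter runs."""
    return len(VOWEL_RUN.findall(text.lower()))


def hexameter_length_lines(lines, low=12, high=17):
    """Keep the lines whose naive syllable count is between low and high."""
    return [line for line in lines if low <= count_syllables(line) <= high]


candidates = [
    "Keep us posted",
    "imageries wilderness carignan westerner gaudio mayo",
]
print(hexameter_length_lines(candidates))
```

The fallibility shows up quickly: "peacefully" is counted as four syllables because of its silent "e", and "mayo" as one because "ayo" is a single vowel run.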

... or so I was going to say, before you updated and said that you succeeded. Congratulations!

eseyffarth commented 9 years ago

Aw, thanks!

eseyffarth commented 9 years ago

Here's what I found out: You can't match the RegEx I defined for hexameter verse against existing book sentences. This is because:

Basic function words are not in the CMUdict; I tried to work around this by giving any one-syllable word the stress 0 (unstressed)
CMUdict gives the stress of each word in isolation, where function words carry stress 1, but within a sentence they are often unstressed. Translating word by word, I couldn't find enough 0-stress syllables in existing sentences.

I think this also means that it's very difficult to generate feet that consist of more than exactly one word.
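
The workaround described above can be sketched as follows. The lexicon values are illustrative and not copied from CMUdict, and `sentence_stress` with its `demote_monosyllables` switch is a hypothetical name, not a function from the HEXEGEN repo.

```python
import re

# Hypothetical stand-in for CMUdict, one stress digit per syllable.
# Function words are listed with stress 1, as they would be in isolation.
TOY_STRESS = {"the": "1", "of": "1", "anger": "10", "goddess": "10"}


def sentence_stress(sentence, lexicon=TOY_STRESS, demote_monosyllables=True):
    """Translate a sentence word by word into a string of stress digits.

    Returns None if any word is missing from the lexicon. With
    demote_monosyllables set, every one-syllable word is treated as
    unstressed (0), whatever the lexicon says.
    """
    digits = []
    for word in re.findall(r"[a-z']+", sentence.lower()):
        pattern = lexicon.get(word)
        if pattern is None:
            return None
        if demote_monosyllables and len(pattern) == 1:
            pattern = "0"
        digits.append(pattern)
    return "".join(digits)


print(sentence_stress("The anger of the goddess"))
print(sentence_stress("The anger of the goddess", demote_monosyllables=False))
```

Without the demotion, the example sentence has almost no 0-stress syllables, which is the shortage described above. With it, every monosyllable is flattened to 0, including content words that would really carry stress.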

I was thinking about using Twitter as a source for sentences instead of Gutenberg, but that won't solve my stress problem. It might still work a bit better due to the corpus being bigger.

My stick-words-together approach now alternates between dactyls and spondees in the first 4 feet:

hamptonshire taxiing lobbyist agony bredeson hornyak andrews noisiest goodsell flashdance berkelman grandis voyager's bighorn rehfeld howison rosensweig flegal trimedyne circus's doomsday holsapple ravages thwarting trachea bootleg holiness ashline cormorants courson non-stop northbrook invitron's barnhouse ciaccio virna
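
A rough sketch of a one-word-per-foot generator of this kind. It assumes each of the first four feet is picked freely as a dactyl or a spondee, followed by a dactyl and a spondee. The two word pools are taken from the sample above and sorted by a plain reading of their stress, not checked against CMUdict. `hexameter_line` and `epos` are hypothetical names.

```python
import random

# Assumed foot classification of words from the sample output.
DACTYLS = ["hamptonshire", "taxiing", "lobbyist", "agony",
           "noisiest", "berkelman", "holiness", "ravages"]
SPONDEES = ["hornyak", "goodsell", "flashdance", "bighorn",
            "doomsday", "bootleg", "northbrook", "barnhouse"]


def hexameter_line(rng):
    """Build one line of six feet, using exactly one word per foot."""
    # Feet 1-4: pick a foot type first, then a word of that type.
    feet = [rng.choice(rng.choice([DACTYLS, SPONDEES])) for _ in range(4)]
    feet.append(rng.choice(DACTYLS))   # foot 5: dactyl
    feet.append(rng.choice(SPONDEES))  # foot 6: spondee
    return " ".join(feet)


def epos(n_lines, seed=0):
    """Generate n_lines lines; the seed makes the output repeatable."""
    rng = random.Random(seed)
    return [hexameter_line(rng) for _ in range(n_lines)]


for verse in epos(3):
    print(verse)
```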

Gotta find an intelligent way of inserting punctuation now. I guess I'll start by reading the punctuation distribution from the Iliad or Aeneid and copying each punctuation mark to a similar location in my text.
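
One way this idea might work, as a sketch: record how many words precede each punctuation mark in the source text, then scale those positions to the length of the generated text. `punctuation_profile` and `punctuate` are hypothetical names, and the short sample sentence merely stands in for a real Iliad or Aeneid translation.

```python
import re

PUNCT = set(".,;:!?")


def punctuation_profile(source):
    """List (words_before, total_words, mark) for each mark in source."""
    tokens = re.findall(r"[A-Za-z']+|[.,;:!?]", source)
    total_words = sum(1 for token in tokens if token not in PUNCT)
    profile = []
    words_before = 0
    for token in tokens:
        if token in PUNCT:
            if words_before:  # skip marks that precede the first word
                profile.append((words_before, total_words, token))
        else:
            words_before += 1
    return profile


def punctuate(words, profile):
    """Attach each mark after the word at the same relative position."""
    out = list(words)
    if not out:
        return ""
    for words_before, total_words, mark in profile:
        # Scale the source position to the length of the target text.
        idx = max(1, round(words_before * len(out) / total_words))
        if out[idx - 1][-1] not in PUNCT:  # at most one mark per word
            out[idx - 1] += mark
    return " ".join(out)


profile = punctuation_profile("Sing, goddess, the anger of Achilles.")
generated = ("hamptonshire taxiing lobbyist agony bredeson hornyak "
             "andrews noisiest goodsell flashdance berkelman grandis").split()
print(punctuate(generated, profile))
```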

If I have enough time, I might do some POS filtering stuff to try to get some syntactic coherence into my poems.

eseyffarth commented 9 years ago

I made something that may not exactly be a Poetry Generator, but I thought it might still fit here. It's a @VagueLyrics bot, tweeting vague versions of song lines. It currently replaces 3 out of 4 nouns with "something", but I have plans to make it a little more interesting soon.
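
The replacement step could be sketched like this, assuming the line has already been POS-tagged with Penn Treebank style tags by some tagger. How the bot actually picks which nouns to keep is not stated in the thread. Keeping every fourth noun is an assumption, and `vague` is a hypothetical name.

```python
def vague(tagged_tokens, keep_every=4, filler="something"):
    """Replace nouns with filler, keeping every keep_every-th noun."""
    out = []
    nouns_seen = 0
    for token, tag in tagged_tokens:
        if tag.startswith("NN"):  # NN, NNS, NNP, NNPS
            nouns_seen += 1
            if nouns_seen % keep_every != 0:
                token = filler
        out.append(token)
    return " ".join(out)


# Hand-tagged example line (made up, not a real lyric).
tagged = [("the", "DT"), ("river", "NN"), ("meets", "VBZ"),
          ("the", "DT"), ("ocean", "NN"), ("under", "IN"),
          ("the", "DT"), ("moon", "NN"), ("and", "CC"),
          ("the", "DT"), ("stars", "NNS")]
print(vague(tagged))
```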

NaPoGenMo / NaPoGenMo2015

Hexameter Epos Generator: HEXEGEN #10