NaPoGenMo / NaPoGenMo2016

National Poetry Generation Month 2016

Simplest Possible Product #3

Open MichaelPaulukonis opened 8 years ago

MichaelPaulukonis commented 8 years ago

I don't have any big ideas this year, or at least none that I have any chance of completing in the midst of everything else I'm doing.

So -- I'm planning to connect a "poem" generator to a Tumblr bot.

The generator will likely be a minor elaboration on what I did last year -- intake, processing, output. The wire-up will give me daily results to look at and tweak, as an on-going goad.

So: no giant contributions to the art.

As usual: incremental advances.

MichaelPaulukonis commented 8 years ago

So, I've been looking at the Lexeduct code that Chris Pressey started last year. I didn't look into it enough at the time, and my work with it was at cross-currents to its ideology (my work last year was in the gh-pages branch, and "worked", even though it doesn't fit the main model in the master branch).

I think wrangling that understanding and applying it to what I currently want to do will be too time-consuming (although profitable).

So, I'm going to do the Simplest Thing That Could Possibly Work.

  1. static text generator posts to Tumblr
  2. text generator becomes non-static
  3. elaborate and iterate on step 2
  4. end-goal includes ingestion of source material from online news

MichaelPaulukonis commented 8 years ago

http://poeticalbot.tumblr.com/

https://github.com/MichaelPaulukonis/NaPoGenMo2016

MichaelPaulukonis commented 8 years ago

The initial non-static implementation is a headless version of Edde Addad's jGnoetry, built on my existing derived work.

Also, it's running live on Heroku with a once-a-day scheduler.

So without further intervention -- we'll have a poem-a-day.

Further elaborations will be:

MichaelPaulukonis commented 8 years ago

Some examples:

quatrain 0 1 98 1

The filmed culture teaches as reporting On average annual increase the decisions They would otherwise ensure. These students To use of the challenge in america.

howl 29 39 26 6

Cut out of the ills

Infringements

A bag find those holy want is true my perspective, even though not something sweet hue is. Love groan

It easier article had mastered of thousands just cut out the

Gentle original a bag thought thee, increasingly copy as and there you. Shake gently manipulate Pavements,, even if

I schools and the United a dream the order in the web-log, beauty’s the sun and put them to mine. Next my article audit the bag. Copy conscientiously in a canker streetlight

In chapter a newspaper. Then one particularly seem like endless there is his triumphant prize. And, even thought where you’re order flowers

Saw that makes a newspaper. The order which cannot just and to make of a world where is up, even though unappreciated by the breath

That coughs o. Take a kid. If thou. Take and honor make your self alone sensibility, the worst one after the [etc. etc. etc.]


There's something weird with the opening of the Howl template; have to look at that some more.

The corpus weights should be able to select only one text, some of the time.

The title is the template name, plus the corpus weights. Since nobody knows what those are, it's pretty much magic numbers, but that's fine by me for a naive algorithm. (I'm reminded of how Edde Addad would use GUIDs as names. Unwieldy, but... unique.)

Looking at taking initial words, highest frequency words, other options.

MichaelPaulukonis commented 8 years ago

I think the Howl template is... too big? It's fun, but it's a blast of words, and goes on too long.

Last night I added a new template (derived from the start of an MSDN page, no less), source texts of The Wizard of Oz and a script of Apocalypse Now, a new title algorithm, and some tweaks. The script may be odd. OTOH, that may lend some interesting frisson. Maybe if there were more scripts?

And again -- all of this is markovian. Which has its pleasantries, but also wears thin and obvious rather quickly.
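For readers unfamiliar with the term, here is a minimal sketch of word-level Markov generation. It is written in Python purely for illustration; it is not jGnoetry's code, and the function names and sample text are hypothetical.

```python
import random
from collections import defaultdict

def build_bigrams(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, length, rng):
    """Walk the chain from `start`, stopping early at a dead end."""
    out = [start]
    while len(out) < length:
        choices = followers.get(out[-1])
        if not choices:
            break  # the last word never had a successor in the source
        out.append(rng.choice(choices))
    return " ".join(out)

# Hypothetical sample text, not one of the bot's corpora.
sample = "the road of yellow brick and the road of the river"
chain = build_bigrams(sample)
line = generate(chain, "the", 8, random.Random(2016))
```

Every adjacent word pair in the output already occurs in the source, which is part of why the results start to feel obvious after a while.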


CALCULATED CORRECTLY THAT

He calculated correctly that helped Amplify text. Today you have developed.


I want to get some more title algorithms and switch them up, and get more of the parameters randomized. They won't be big changes -- but will affect capitalization and punctuation a bit. Also, I'd like the ability to pick only one or two texts from the corpus; the current algorithm can randomly assign some to 0%, but I'd like most of them to be 0% on occasion.
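One way the "mostly 0%" idea might look, as a sketch in Python for illustration. It assumes the weights are integer percentages that sum to 100 (the example titles above are consistent with that, but it is an assumption), and `sparse_weights` and `max_active` are hypothetical names.

```python
import random

def sparse_weights(n_texts, rng, max_active=2):
    """Return integer percentages summing to 100, at most max_active nonzero."""
    k = rng.randint(1, min(max_active, n_texts))
    active = rng.sample(range(n_texts), k)
    # k - 1 distinct cut points split 100 into k positive parts.
    cuts = sorted(rng.sample(range(1, 100), k - 1))
    parts = [b - a for a, b in zip([0] + cuts, cuts + [100])]
    weights = [0] * n_texts
    for index, part in zip(active, parts):
        weights[index] = part
    return weights

weights = sparse_weights(4, random.Random(7))
```

Calling this only some of the time, and the existing weighting otherwise, would keep the current behavior while occasionally letting one or two texts dominate.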

The title algorithm tracks the most common words in the text and selects some of them (randomly, but weighted for 4..10 if that many words are present). It was biased towards common words like "and, of, or, the, is, an" etc. -- so I had it ignore words < 4 characters. Naive (since "this, that, those" slip through), but quickly workable. There's a big fat library that can pick out topic words, but at the moment it seems a large dependency. OTOH, there's a lot of room in Heroku to add in libs....
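A rough sketch of that title approach, in Python for illustration only. The names and the `how_many` default are hypothetical, and the 4..10 weighting is left out because its exact form isn't spelled out here; this version just samples uniformly from the top words.

```python
import random
import re
from collections import Counter

def title_candidates(text, min_len=4, top_n=10):
    """Most frequent words with at least min_len characters, most common first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if len(w) >= min_len)
    return [w for w, _ in counts.most_common(top_n)]

def make_title(text, rng, how_many=3):
    """Pick a few frequent words at random and upper-case them."""
    candidates = title_candidates(text)
    picked = rng.sample(candidates, min(how_many, len(candidates)))
    return " ".join(picked).upper()

# Hypothetical input text for the example.
poem = "He calculated correctly that the cat is on the mat and the calculated risk"
candidates = title_candidates(poem)
title = make_title(poem, random.Random(1))
```

The length filter drops "the" and "and", but "that" still gets through, which is the naivety mentioned above.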

I've turned the bot on to fire hourly during development, but it will go back to once-a-day when the dust settles.

I want some other poetry generation algorithms, and really want to tweak the template generation and processing -- would like to do some fill-in-the-blanks. That is, have words present in the template that are spit back out. Currently, any non-template token in the template is treated as a reason to templatize the input text.
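A sketch of what fill-in-the-blanks could look like, in Python for illustration. The `[w]` slot marker is hypothetical and is not jGnoetry's actual template syntax; the point is only that slot tokens get replaced while every other token is passed through unchanged.

```python
SLOT = "[w]"  # hypothetical marker, not jGnoetry's real template syntax

def fill_template(template, words):
    """Replace each slot marker with the next generated word; keep other tokens."""
    supply = iter(words)
    out = []
    for token in template.split():
        if token == SLOT:
            # If the generated words run out, leave the marker in place.
            out.append(next(supply, SLOT))
        else:
            out.append(token)
    return " ".join(out)

filled = fill_template("The [w] teaches the [w]", ["culture", "students"])
```

This handles a single line split on whitespace; a multi-line template would need the same treatment applied line by line.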

MichaelPaulukonis commented 8 years ago

Just to be clear, the code is auto-posting once an hour to http://poeticalbot.tumblr.com/