NaNoGenMo / 2016

National Novel Generation Month, 2016 edition.
https://nanogenmo.github.io

Entry: The Days Left Forebodings and Water #119

Open lizadaly opened 7 years ago

lizadaly commented 7 years ago

Blackout generates pages of text from book or newspaper scans in the style of Newspaper Blackout Poetry, popularized by Austin Kleon (cf. A Humument by Tom Phillips).

Blackout does the following:

  1. Take, as input, an image of text, from a newspaper or book.
  2. Run OCR (pyocr, https://github.com/jflesch/pyocr) against the image, identifying the words and their bounding boxes.
  3. Feed the extracted text into a natural language parser (spaCy, https://spacy.io/), tagging each word with its part of speech.
  4. Given one of many randomly selected Tracery (pytracery, https://github.com/aparrish/pytracery) grammars, select words from the current page that match the parts of speech of that grammar.
  5. Draw around those words and "scribble" out all other text on the page image.
  6. Output the final page as a new image.
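
Steps 4 and 5 can be sketched without any OCR or NLP dependency, assuming the earlier steps have already produced a list of words with bounding boxes and part-of-speech tags. The `Word` class, the `select_words` function, the sample page, and the tag names below are hypothetical and for illustration only; they are not Blackout's actual code, and a flat list of tags stands in for an expanded Tracery grammar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    pos: str      # part-of-speech tag, e.g. "NOUN" (assumed tag set)
    box: tuple    # (left, top, right, bottom) in pixels


def select_words(words, pattern):
    """Pick words, in reading order, whose tags match `pattern` in sequence.

    Returns (kept, scribbled). If the page cannot satisfy the whole
    pattern, returns (None, all words) so the caller can try another grammar.
    """
    kept_indexes = []
    position = 0
    for wanted in pattern:
        # Scan forward from the last match so the result reads top to bottom.
        while position < len(words) and words[position].pos != wanted:
            position += 1
        if position == len(words):
            return None, list(words)
        kept_indexes.append(position)
        position += 1
    chosen = set(kept_indexes)
    kept = [words[i] for i in kept_indexes]
    scribbled = [w for i, w in enumerate(words) if i not in chosen]
    return kept, scribbled


# A made-up page: in practice the boxes come from OCR and the tags from a parser.
page = [
    Word("The", "DET", (10, 10, 40, 30)),
    Word("days", "NOUN", (45, 10, 90, 30)),
    Word("left", "VERB", (95, 10, 130, 30)),
    Word("many", "ADJ", (135, 10, 180, 30)),
    Word("forebodings", "NOUN", (10, 40, 120, 60)),
    Word("and", "CCONJ", (125, 40, 155, 60)),
    Word("water", "NOUN", (160, 40, 210, 60)),
]
```

Calling `select_words(page, ["DET", "NOUN", "VERB", "NOUN"])` keeps "The", "days", "left", and "forebodings" and marks the remaining words for scribbling. A pattern the page cannot satisfy returns `None` for the kept list, which is one plausible cue to pick a different grammar.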

Pen width, line length, line direction, number of strokes, and stroke opacity are all randomly fuzzed. The pen color is almost always black; in rare cases it is blood red.
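
As a rough illustration of that fuzzing, the sketch below draws a random set of stroke parameters for one scribbled-out word. Everything specific here is an assumption: the `Stroke` class, the `fuzzed_strokes` function, the numeric ranges, the 2% chance of red, and the RGB value used for "blood red" are all invented for the example and are not taken from Blackout's source.

```python
import random
from dataclasses import dataclass


@dataclass
class Stroke:
    width: int       # pen width in pixels
    length: float    # fraction of the word box covered; above 1.0 overshoots
    angle: float     # tilt from horizontal, in degrees
    opacity: int     # alpha channel, 0-255
    color: tuple     # RGB


def fuzzed_strokes(rng, red_chance=0.02):
    """Return a randomized list of strokes for scribbling out one word."""
    # One color per word: black almost always, red on a rare draw.
    color = (138, 3, 3) if rng.random() < red_chance else (0, 0, 0)
    count = rng.randint(2, 5)  # number of overlapping strokes (assumed range)
    return [
        Stroke(
            width=rng.randint(3, 8),
            length=rng.uniform(0.9, 1.15),
            angle=rng.uniform(-4.0, 4.0),
            opacity=rng.randint(180, 255),
            color=color,
        )
        for _ in range(count)
    ]
```

Passing in a `random.Random` instance rather than using the module-level generator makes a page reproducible from a seed, which helps when hand-picking generated pages afterwards.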

The work:

"The Days Left Forebodings and Water"

The source material is A Vindication of the Rights of Woman by Mary Wollstonecraft (1792).

Read The Days Left Forebodings and Water. It is 45 pages long and consists of entries that were generated randomly, then hand-picked and ordered on November 9, 2016.

(The full NaNoGenMo entry of ~50,000 words is a 9.3GB PDF of nearly 10,000 pages and is no longer available for download.)

Full source code and more examples: https://github.com/lizadaly/blackout

enkiv2 commented 7 years ago

Oh man, this is so cool. Blackout poetry has been done before but only as redaction of particular words as far as I can tell; actually scribbling out portions of an image of a page is a really neat idea.


anjabeth commented 7 years ago

Whoa, I love this! Particularly what you did to make it look like "real" blackout poetry, with the pen strokes and everything. I haven't read the whole of "The Days Left Forebodings and Water" yet, but I'm excited to (and what a great title).