NaNoGenMo / 2019

National Novel Generation Month, 2019 edition.

Dublin Walk #102

Open mewo2 opened 4 years ago

mewo2 commented 4 years ago

A Nano-NaNoGenMo entry - a 256-character program that walks through the lexicons of Joyce's Ulysses and Dubliners (neither work alone has enough vocabulary), choosing at each step a word whose set of letters is similar to that of the previously selected word.

cat 4300-0.txt 2814-0.txt|python3 -c "import sys,re;w=set(re.split(r'[^A-Za-z\'’.\-,?\!;:]',sys.stdin.read()));l=['Once']
while w:l.append(min(w,key=lambda x:len(set(x)^set(l[-1]))));w.remove(l[-1])
print('DUBLIN WALK\nMartin O\'Leary\n\n'+' '.join(l))"

The output starts by gradually drifting between sounds and spellings:

DUBLIN WALK Martin O'Leary

Once Once One One, ne, even, envy, evenly, evenly velly evly eely yell alley yale early layer really yearly rarely Rarely Really Really? Ready? Ready Read Red de deed fed feed defeegee fagged gagged adage aged dagger regarded regard daggered degraded dragged agreed ragged grade guarded dueguard argue uneager ungrate guarantee inaugurate arruginated inaugurated treading gradient redpanting departing pretending dreeping grinned reddening ringed rendering reigned grinned, diner, dinner, friend, finder friend inferred refined indifferent different interfered frittered drifted retired tried tired retrieved rivetted diverted driver derived drive derive deprive ripped pride prided peppered deeper peered petered departed parted rerepeated depart repeated trade retarded ratted tarred retreated tattered aerated darted treated tread reader dear read dare dearer reared adread reread dared eared Heard Herald ladder laddered leader raddled Wellread Waddler Wellmannered alderman alderman. marmalade. alarmed. dreamed.

By the end it has devolved into apparently random selections of Joycean vocabulary:

Metempsychosis? wisdom? doggybowwowsywowsy! Pwfungg! Magmagnificence! M’Guckin! Burke’s! Hercules: Heartbeats: Characters: Trieste-Zurich-Paris breakfast-cup perturbations, fingerjoints, Pokethankertscheff, MacDermott, Driscoll, Dubliners. Bullockbefriending. Bullockbefriending bullockbefriending Mecklenburgh Aldborough Fishguard-Rosslare Caruso-Garibaldi oil-rag.’ Koh-i-Noor Borris-in-Ossory. propinquity: Favourite David? David: O’Donovan O’Donoghue O’Donoghue. Daughter’s. Ruttledge’s Olhausen’s, establishment; l’zamatejch jigtime. exhumed. Excitedly. Indignantly. Vladivostok. Chapelizod imprevidibility perfectible, brightwindbridled, handsomemarriedwomanrubbedagainstwidebehindinClonskea contransmagnificandjewbangtantiality. Nationalgymnasiummuseumsanatoriumandsuspensoriumsordinaryprivatdocentge Dungdevourer! Goutte-d’Or, Two-and-four, salmongaffs, Szesfehervar, Yrfmstbyes. keyboard: Godblazeqrukbrukarchkrasht! Schwanzenbad-Hodenthaler, Guelph-Wettin! GUTENBERG-tm BRAYDEN, ALEXANDER RUMBOLD, RUMBOLD: RUDOLPH: DLUGACZ: RAGAMUFFINS: ZOE-FANNY:

Full text is available in this gist: https://gist.github.com/mewo2/3e6ad222e516367ada96fbd8c9a1106d

enkiv2 commented 4 years ago

This is wonderful.

mewo2 commented 4 years ago

A look at what the code does, since some people have asked.

First, a slightly decompressed version of the Python program:

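import sys, re
# Unique "words": runs of letters plus a few unpaired punctuation marks.
words = set(re.split(r'[^A-Za-z\'’.\-,?\!;:]', sys.stdin.read()))
novel = ['Once']
while words:
    # Append the remaining word whose character set is closest to the last word's.
    novel.append(min(words, key=lambda word: len(set(word) ^ set(novel[-1]))))
    words.remove(novel[-1])
print('DUBLIN WALK\nMartin O\'Leary\n\n' + ' '.join(novel))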

We start by importing the necessary libraries: sys for reading input and re for regular expressions:

import sys, re

The next line reads the input, splits it into words, and throws them in a set data structure, meaning that only one copy of each unique word is retained. The regular expression for splitting (r'[^A-Za-z\'’.\-,?\!;:]') allows words to consist of letters and some punctuation. I deliberately left out any punctuation that needs to match in pairs, like brackets or quotation marks.

words = set(re.split(r'[^A-Za-z\'’.\-,?\!;:]', sys.stdin.read()))
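As a rough illustration of the splitting step (not part of the original entry), the sketch below applies the same pattern to a made-up sentence. The names SPLIT_PATTERN and sample are assumptions introduced here. It suggests that trailing punctuation stays attached to a word, that quotation marks and brackets act as separators, and that adjacent separators leave an empty string in the set.

```python
import re

# Same character class as the original program: split on anything that is
# not a letter or one of the allowed punctuation marks.
SPLIT_PATTERN = r'[^A-Za-z\'’.\-,?\!;:]'

# Made-up sample text, used here only for illustration.
sample = 'Once upon a time, and a "very good" time it was (once).'

tokens = re.split(SPLIT_PATTERN, sample)
words = set(tokens)  # duplicates such as the second 'a' collapse

print(tokens)
print(sorted(words))
```

Here 'time,' and 'time' count as different words, while 'very' loses its quotation mark. The empty string comes from places where two separators sit next to each other, such as a space followed by an opening quote.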

Then we initialise our novel as a list containing the string "Once". This actually gets doubled in the output because of a bug: "Once" is also in the word set, and nothing is closer to "Once" than "Once" itself. I wasn't bothered about fixing it.

novel = ['Once']
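A minimal sketch of why the seed is doubled, using a toy vocabulary rather than the real texts. It relies on the selection rule described later in the post (pick the word whose character set differs least from the last word's). The discard call is one hypothetical fix, not something the original program does.

```python
seed = 'Once'
words = {'Once', 'One', 'even'}  # toy vocabulary; the seed also occurs in it


def distance(word):
    # Number of distinct characters found in exactly one of `word` and `seed`.
    return len(set(word) ^ set(seed))


# The seed is at distance 0 from itself, so it is picked first and repeated.
first_pick = min(words, key=distance)

# Hypothetical fix: drop the seed from the working set before walking.
words.discard(seed)
first_pick_after_fix = min(words, key=distance)

print(first_pick, first_pick_after_fix)
```

With the seed removed, the walk would go straight from "Once" to "One" in this toy case.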

The main loop proceeds by taking the last word in the novel (novel[-1]) and finding the "closest" word to it in the working set. This gets appended to the novel, and removed from the working set. The "closeness" of words is defined by the size of the symmetric difference of their sets of characters: the number of unique characters which appear in either word but not both. To calculate this we use the slightly obscure ^ operator in Python, which computes the symmetric difference of two sets.

while words:
    novel.append(min(words, key=lambda word: len(set(word) ^ set(novel[-1]))))
    words.remove(novel[-1])
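To make the distance concrete, here is a small sketch that wraps the same expression in a helper and applies it to a few pairs of words from the output. The function name letter_distance is an assumption introduced for this example.

```python
def letter_distance(a, b):
    """Size of the symmetric difference of the two words' character sets."""
    return len(set(a) ^ set(b))


# Letter order and repeated letters are ignored, so near-anagrams score 0.
print(letter_distance('yearly', 'rarely'))  # → 0

# One extra distinct letter ('f') gives a distance of 1.
print(letter_distance('deed', 'fed'))       # → 1

# Case matters: 'R' and 'r' are different characters.
print(letter_distance('Ready', 'ready'))    # → 2

# Punctuation counts as a character too.
print(letter_distance('Ready?', 'Ready'))   # → 1
```

Since min returns the first minimal element it meets, ties are presumably broken by the set's iteration order, so a rerun may not reproduce exactly the same walk.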

Finally, we join all the words into one string, and print it with a title block. I didn't see a good way of introducing any line breaks, so the whole thing is one 51,185-word paragraph.

print('DUBLIN WALK\nMartin O\'Leary\n\n' + ' '.join(novel))

To make it fit in 256 characters, the variable names get squished to one character and all the whitespace removed. This then gets wrapped up in a UNIX command line, which feeds it the two texts (Ulysses is 4300-0.txt and Dubliners is 2814-0.txt), combined using the UNIX cat utility. It calls python3 explicitly, but I don't think anything here actually requires it - perhaps I could have saved a character by calling python instead.

cat 4300-0.txt 2814-0.txt|python3 -c "import sys,re;w=set(re.split(r'[^A-Za-z\'’.\-,?\!;:]',sys.stdin.read()));l=['Once']
while w:l.append(min(w,key=lambda x:len(set(x)^set(l[-1]))));w.remove(l[-1])
print('DUBLIN WALK\nMartin O\'Leary\n\n'+' '.join(l))"

nickmontfort commented 4 years ago

Beautiful! Given the poetic drive of this miniature machine, I fully approve of the way this program uses intense computational resources.

enkiv2 commented 4 years ago

I ran speech synthesis on this & uploaded it to YouTube (with a video track built out of random processed clips), since I thought it would have an interesting effect when spoken: https://www.youtube.com/watch?v=Sk2Eu9cAd7o&feature=youtu.be

(It's sort of taking forever to process, but it is four hours and change long)

mewo2 commented 4 years ago

@enkiv2: I would greatly prefer if you didn't do this without asking. As an artist, I like to maintain control over my work and its presentation. I'd appreciate it if you removed the video.

enkiv2 commented 4 years ago

Alright, will do.

mewo2 commented 4 years ago

Thanks!

arnicas commented 4 years ago

Reminds me of Of Oz the Wizard :) https://vimeo.com/150423718