dariusk / NaNoGenMo-2015

National Novel Generation Month, 2015 edition.

Interlude: (Un)Sound Structures #135

Open jseakle opened 8 years ago

jseakle commented 8 years ago

I needed a break from trying to solve hard problems in my primary NaNoGen, so I bashed this thing out real quick.

The idea is to take the structure of an existing document, and render it into a sound poem(?)/mess/thing. Hopefully the result is fun to read aloud, and also somewhat intriguing to look at, in that you can possibly figure out some of what was going on in the original, but really not very much.

For instance, here is the output on a favorite Mervyn Peake poem:

Gso Thexad Toshrr Saofa

Klod imo stumo nsohyn rpef rmeti ckoso iwha crwanc tianuv Temo long phoon raathens murnh ock fiasp ic chiebb? Ry, wen kbyos xpawaring rad flera noda nechis Stomnect ri uitrr Ksy pahanum westsan dret i staotebr bepsr, Int oht bly cten Esseresesi uccinc ylol udo sis Ppo dersto htir o dabl yith ig yca rorvs Atyknd eng raomln An la-intablhis arslutish igis.

Gleh eti flyka llidar tcih seho he mpeys urh nser? Ytidi brenor seds cuotab garc iem an lsiel pemiw? Coypit stev spt ry kehahayt, detacr reqi Ysi hobramm al e trasul ppai hactibe Rciti yeormoh cesthot ckiha; ew rena rcue neef Er hutres nasmeti nsen asks tir Censf thustoh sans shu reo. Lonewh iest imbac Sech yocull nint mmuyl meotmit cehif nteso Er cahuch ysaneri dsaemf tboab ntersn dena ripi. Reri cteowam uy birr eds liscot togsheesc mannig Yglu stifi deskeang rsandoc bsuh rda ckeet Ar sat rmi nugi stos atot daest mambaano, Ttar scuto elc ndeha.

Bbog uty stu muymc an nnut, sci msec an crand, Ffo nsedg breh tostil anc cka tef ung rear; Ein tdost sho miebinung utasoyhh ethedp phir Ngeyst ppis ysan naeplir nsel rma fept con nusa. Wki hablt niw lunggi her nliti bbiur lod sept Org ndaly cra rootl empo ttaet ndo wegb wong miwurt, Iw figlp gsid flehn umaeh uesh riping cagr Al llardyeyg stetitg, buwho cltiang rro rullrynd Ad sogher nsersd, dars gohurw trew, himminpifis, Grepruh twoax seprah dyncr.

Lsov at rwed ymes tlof ezbd nyhrapleh Onsadanon vati hik ipdyadond'd bbeblw selih? Rsa rygunas wekun ggoh nomanc nseng lpumbv, Tedmetan necr stiat athopaes odoseceifs, Gavy uing u mondust ceggn uh cophinafl, Rho mandir shadohtn ong nti tireyaad, Nsa wiov it msoaltisc osp nti craar us iglah.

Rsef ed grar grar bleh yfla argtecor ost stup? Id ac narw sich ce hesmtont, se tewn nowrhabr Nyb ef ak ehsolur ens Wulo nil viksil Sint erde pempsost stdeitt est heatraserr smowa, Ock um gest heyl pe tilnditt, ri nicr senslotc Ndipy sastoff creth egist tro nrmymseit. Sha sersra ntis et sitsach eh bla soec Ogritog fri voong ar Rirkup ehn Dalsiee Soapholi, Ngie obr Deradrse - eys mahuaz, Eds omile sqiimberl leongeth lloc feif nelmorf.

Tfi phisod ronersutr ost gho rsant lilalbars.

It does this by replacing diphthongs with diphthongs, consonants with consonants, and vowels with vowels, chosen at random from English frequency charts.
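A minimal sketch of that substitution scheme, in Python. This is not the linked code: the two-letter clusters in `PAIRS`, their weights, the approximate single-letter weights, and the names `pick` and `scramble` are all assumptions made for illustration. The self-avoidance rule follows the later remark in this comment that a letter is never replaced with itself.

```python
import random

# Approximate English letter frequencies (percent); illustrative only.
VOWELS = {"a": 8.2, "e": 12.7, "i": 7.0, "o": 7.5, "u": 2.8}
CONSONANTS = {"t": 9.1, "n": 6.7, "s": 6.3, "h": 6.1, "r": 6.0, "d": 4.3,
              "l": 4.0, "c": 2.8, "m": 2.4, "w": 2.4, "f": 2.2, "g": 2.0,
              "y": 2.0, "p": 1.9, "b": 1.5, "v": 1.0, "k": 0.8, "j": 0.2,
              "x": 0.2, "q": 0.1, "z": 0.1}
# Hypothetical cluster table with made-up weights.
PAIRS = {"th": 3.0, "st": 1.5, "ng": 1.2, "nd": 1.2, "ou": 1.0,
         "ea": 0.9, "ch": 0.6, "sh": 0.5, "ck": 0.3}


def pick(table, avoid, rng):
    """Weighted random key from table, never returning `avoid`."""
    choices = [k for k in table if k != avoid]
    weights = [table[k] for k in choices]
    return rng.choices(choices, weights=weights, k=1)[0]


def scramble(text, rng=None):
    """Swap each cluster, vowel, and consonant for another of its kind."""
    rng = rng or random.Random()
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2].lower()
        ch = text[i].lower()
        if pair in PAIRS:            # two-letter cluster takes priority
            new, step = pick(PAIRS, pair, rng), 2
        elif ch in VOWELS:
            new, step = pick(VOWELS, ch, rng), 1
        elif ch in CONSONANTS:
            new, step = pick(CONSONANTS, ch, rng), 1
        else:                        # spaces, punctuation, digits pass through
            new, step = text[i], 1
        if text[i].isupper():        # keep the capital at the start of a unit
            new = new.capitalize()
        out.append(new)
        i += step
    return "".join(out)


print(scramble("Out of the thick song"))
```

Because every unit is replaced by one of the same length, word lengths, spacing, and punctuation survive, which is why the shape of the original stays visible in the output.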

The one major improvement that I could make, but likely won't, is to divide the diphthongs up into beginners, middles, and enders, because e.g. "gsid" and "ntersn" possibly might, for some people, fail the "fun to read aloud" test.
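For what it's worth, the beginner/middle/ender split could be sketched roughly like this. The pools in `CLUSTERS` and the helper names `slot` and `swap_cluster` are hypothetical, and the choice is uniform rather than frequency-weighted to keep the example short.

```python
import random

# Hypothetical pools: which two-letter clusters are allowed in which slot.
CLUSTERS = {
    "begin":  ["th", "st", "sh", "ch", "tr", "br"],
    "middle": ["th", "st", "sh", "ch", "nd", "ck", "ng"],
    "end":    ["th", "st", "sh", "ch", "nd", "ck", "ng"],
}


def slot(index, word_len):
    """Classify a two-letter cluster by where it sits in the word."""
    if index == 0:
        return "begin"
    if index + 2 == word_len:
        return "end"
    return "middle"


def swap_cluster(word, index, rng):
    """Replace the cluster at `index` with a different one from the same slot."""
    original = word[index:index + 2]
    pool = [c for c in CLUSTERS[slot(index, len(word))] if c != original]
    return word[:index] + rng.choice(pool) + word[index + 2:]


rng = random.Random()
print(swap_cluster("thing", 0, rng), swap_cluster("thing", 3, rng))
```

With pools like these, a word can no longer start with "ng" or "ck", which is the kind of output that trips up reading aloud.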

Anyway, let's grab that COMPLETED label! I've run the program on all 111k words of Catherynne M. Valente's latest novel Radiance, to produce my first real NaNoGenMo success in three years of attempting to participate: Sound Radiance!

I picked Radiance because it has a lot of structural variety, and also because I am currently only partway through, and am somewhat tickled by the idea of staring at later parts of Sound Radiance and trying to derive spoilers therefrom.

One issue this project brings to mind is the legality of heavily lossy encodings of copyrighted material. Presumably if I rot13'd the text of the novel, that would be a no-no, but I feel fairly confident that even extensive cryptanalysis could not recover significant portions of the original text from the above file. Small words yes, and maybe a few oft-repeated names, since the algorithm avoids replacing a letter with itself, but I think that it is not, in general, reversible. But that's another somewhat interesting question - does it matter how hard it is? Does it matter whether I was pretty sure it was impossible, if it turns out not to be?

Here's the code, which I hereby declare to be public domain. As I did this for a quick break in a couple hours, it is decidedly Not Good.

hugovk commented 8 years ago

Have a COMPLETED!

Merrsoncs © 2015 sr Tiyrinshmi T. Hifasho

cpressey commented 8 years ago

Nice! If, in the next few weeks, you want to take another break, you could throw [parts of] this through a voice synthesizer -- seems fitting for a sound poem -- and submit it to NaOpGenMo...

jseakle commented 8 years ago

Man, you ruined my cool surprise! :P

I had to get to sleep last night, but I was planning to do speech synthesis to this today. Didn't know about NaOpGenMo, though, that is neat and I will consider submitting.

...how long until National "National Generating Month" Generating Month??

cpressey commented 8 years ago

Oh snap, sorry about that! But it seemed like the obvious next step.

Yo dawg, I heard you liked National Generation Months, so I * faints *

jseakle commented 8 years ago

No worries :)

Here's the poem!

MichaelPaulukonis commented 8 years ago

"I think that it is not, in general, reversible."

Since you replaced "diphthongs with diphthongs, consonants with consonants, and vowels with vowels, chosen at random from English frequency charts" -- you could run the same process on the words, compare the variants to a dictionary, and then extend that to general English n-gram frequency tables for word order, and the text might fall back into place. I also assume, since it is so easy for me to make assumptions when I have no intention of coding it up, that once portions of the text are "figured out" the tables can be updated with in-document frequencies. But that might require some human oversight.

Isn't that roughly what mobile-swipe-style keyboards do -- take all of the letters in the path and do a lookup on the (probable) words?
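A rough sketch of the dictionary step of that idea, in Python. Everything here is an assumption for illustration: the cluster set `PAIRS`, the helper names, and the toy word list. It relies on two properties described in the thread: each unit is replaced by one of the same kind, and never by itself. It also takes a shortcut: it splits the scrambled word greedily, although two replaced consonants can happen to form a cluster, so a serious attempt would have to try every possible split.

```python
VOWELS = set("aeiou")
# Hypothetical cluster set; the real tables are not given in the thread.
PAIRS = {"th", "st", "ng", "nd", "ou", "ea", "ch", "sh", "ck"}


def units(word):
    """Greedily split a lowercase alphabetic word into clusters and letters."""
    word = word.lower()
    out, i = [], 0
    while i < len(word):
        if word[i:i + 2] in PAIRS:
            out.append(word[i:i + 2])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out


def shape(word):
    """Signature of unit kinds: P = cluster, V = vowel, C = consonant."""
    return "".join(
        "P" if len(u) == 2 else "V" if u in VOWELS else "C"
        for u in units(word)
    )


def candidates(scrambled, dictionary):
    """Dictionary words that could have produced `scrambled`."""
    target = units(scrambled)
    sig = shape(scrambled)
    hits = []
    for word in dictionary:
        if shape(word) != sig:
            continue  # different structure, cannot be the source
        if any(a == b for a, b in zip(units(word), target)):
            continue  # a unit is never replaced with itself
        hits.append(word)
    return hits


toy_dictionary = ["the", "she", "tho", "and", "cat"]
print(candidates("sho", toy_dictionary))
```

For long words the candidate list is short, and for short common words it is long, so this step alone only narrows things down; the n-gram tables for word order would have to do the rest.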


Love the .mp3!