NaNoGenMo / 2016

National Novel Generation Month, 2016 edition.
https://nanogenmo.github.io

Entry: Superphreak #24

Open dluman opened 7 years ago

dluman commented 7 years ago

This is a tentative note to indicate my intent to attempt a novel this November.

Inspired by the massive amount of data on phreak/hacker culture at textfiles.com, I plan to figure out how text might be run through the process of "boxing," as if the central consciousness of the novel is trying to wardial or autodial lists of numbers, occasionally getting a respondent and interacting with them in some conceptual way given the "box" they have in use at a given time.

Largely still an amorphous idea, I bring my concept to the hive mind as a note to say "I am doing this."

I've loved reading the logs these past years. Why not do it?

I've also given some thought to the concept of "trashing," and using geo-located dumpsters as a kind of marker for an adventure/collaged narrative from the various documents or manuals that my "phreaker" finds on their various trashing runs. As such, physical security could be a variable in terms of site selection, deterring my narrator from attempting to "dumpster dive."
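A minimal sketch of how physical security could act as a variable in site selection, assuming each site carries a made-up `security` score between 0 and 1. The `Site` class, the `pick_site` helper, and the site names are all hypothetical illustrations, not anything from the project itself.

```python
import random
from dataclasses import dataclass


@dataclass
class Site:
    name: str        # hypothetical label for a dumpster location
    security: float  # assumed scale: 0.0 = unguarded, 1.0 = unapproachable


def pick_site(sites, rng):
    """Weighted random choice: the more secure a site, the less likely a visit."""
    weights = [1.0 - site.security for site in sites]
    if not any(weight > 0 for weight in weights):
        return None  # every site is too well guarded, so no trashing run
    return rng.choices(sites, weights=weights, k=1)[0]


sites = [
    Site("switching station lot", 0.9),
    Site("strip-mall dumpster", 0.2),
    Site("fenced depot", 1.0),
]
rng = random.Random(1986)  # fixed seed so a run can be repeated
visit = pick_site(sites, rng)
print(visit.name if visit else "stayed home")
```

A weight of zero means a fully secured site is effectively never visited, while a lightly guarded one dominates the itinerary.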

The setting would almost certainly need to be the '80s in an urban area; probably the area of New Jersey where Bell and its various cognates resided for some time.

One barrier I see is finding the right data source. Given that I know DC has an open dataset (a bunch of them for people interested: opendata.dc.gov) for trash removal, there'd be a more convenient setting in DC. The New Jersey data I've found in brief searching doesn't quite fit what I'm looking for.

dluman commented 7 years ago

For those interested, there is quite detailed telco switch data at: telcodata.us

dluman commented 7 years ago

Because I don't know exactly what I'm doing, I've started a bit early, but I have a repo set up to document progress. I am still trying to figure out how to cobble in interesting narrative text, even if not directly "narrative" in the canonical sense. In any event: the repo

dluman commented 7 years ago

Breakthrough (I think)! I got to thinking about the concept of "trashing" even further. So, while I'm pulling as many of the original technical manuals and documents for the switch tech in the various switching stations my narrator travels to, I'm also going to fill the dumpster/trash with other relevant documents (like fantasy novels, et al.) and pull pages out of each of them. So, instead of the consciousness controlling the narrative, it's like a scrounger scrapbooking what he finds.
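The scrapbooking idea could be sketched roughly as follows, assuming each document is just a list of page strings. The function names `fill_dumpster` and `scrounge`, and the placeholder titles and pages, are invented for illustration only.

```python
import random


def fill_dumpster(manuals, extras, rng, extra_count=2):
    """Mix a site's manuals with a few randomly chosen unrelated documents."""
    picked = rng.sample(sorted(extras), min(extra_count, len(extras)))
    dumpster = dict(manuals)
    for title in picked:
        dumpster[title] = extras[title]
    return dumpster


def scrounge(dumpster, rng, pages_per_doc=1):
    """Pull a few random pages out of every document in the dumpster."""
    scrapbook = []
    for title, pages in dumpster.items():
        for page in rng.sample(pages, min(pages_per_doc, len(pages))):
            scrapbook.append((title, page))
    return scrapbook


# Placeholder corpus: titles and page text are stand-ins, not real sources.
manuals = {"switch manual": ["manual p1", "manual p2", "manual p3"]}
extras = {
    "fantasy novel": ["novel p1", "novel p2"],
    "newspaper": ["front page"],
    "catalog": ["catalog p1", "catalog p2"],
}
rng = random.Random(42)
dumpster = fill_dumpster(manuals, extras, rng)
for title, page in scrounge(dumpster, rng):
    print(f"[{title}] {page}")
```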

Additional layers of complication: weather reports (because I'm setting it during the month of November 198x-ish), political/cultural events, random gripes about work (because the narrator will have some sort of job)—all in addition to the planned layer of the actual trashing trips.

I'm working on getting the PDF-puller ready to go (likely using pdfrw or a similar project).
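One way the PDF-puller might look with pdfrw, as a sketch under assumptions: `choose_page_indices` and `pull_pages` are hypothetical names, and the paths passed in are placeholders. Only `PdfReader(...).pages`, `PdfWriter.addpage`, and `PdfWriter.write` come from pdfrw itself.

```python
import random


def choose_page_indices(page_count, how_many, rng):
    """Pick up to `how_many` distinct page indices, returned in reading order."""
    how_many = min(how_many, page_count)
    return sorted(rng.sample(range(page_count), how_many))


def pull_pages(src_path, dst_path, how_many, rng):
    """Copy a random handful of pages from one PDF into a new PDF."""
    # Imported here so the index helper above works without pdfrw installed.
    from pdfrw import PdfReader, PdfWriter

    pages = PdfReader(src_path).pages
    writer = PdfWriter()
    for index in choose_page_indices(len(pages), how_many, rng):
        writer.addpage(pages[index])
    writer.write(dst_path)


rng = random.Random(1986)
print(choose_page_indices(40, 3, rng))  # three page indices from a 40-page file
```

Calling `pull_pages("manual.pdf", "scraps.pdf", 3, rng)` (both file names hypothetical) would then write the selected pages to a new file.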

This has morphed more into an adventure journal than a travelogue or other narrative form. I think it's somewhat influenced by reading Don DeLillo's Underworld (I'm about halfway through it right now), a novel which uses the waste removal industry as one of many backdrops. So, consider that an influence on my thinking.

tra38 commented 7 years ago

I will say that I am currently hyped up for this project. I shouldn't be hyped up for any project, because hype is the first step on the road to disappointment...and hype is way too common in AI research. But the amount of content that you are planning to add might provide enough variation to keep a person interested in reading 50,000 words. The biggest problem with computer-generated stuff is the feeling of repetition, but this can be alleviated by collecting more text and variations by which to continually surprise the reader. If you are able to have enough stamina to collect enough words (both the documents and the author's own personal journey, social commentary, and personal gripes), I think this project could succeed at producing a "human-readable novel".

Emphasis on If. After all, corpus assembly is a pretty manual process in and of itself, and you don't have to produce a human-readable novel if you don't want to. I just see a lot of potential in this project and wish you the best of luck.

dluman commented 7 years ago

Thanks, @tra38—I see a lot of potential, too, if I can do what you've articulated so well. It's going to rely on my document collection mostly, so I need to fill the "dumpsters" with interesting things—primarily facsimiles of documents/newspapers/et al.; thank goodness I have access to an academic library and quite a few newspaper archives thereby!

dluman commented 7 years ago

A day late, several dollars short (likely), but it's done! Well, a first draft anyway:

https://drive.google.com/open?id=0B4cYh0dWc005aHVXNDc0SGQxZk0

hugovk commented 7 years ago

Repo link: https://github.com/dluman/NaNoGenMo-2016

I like the mix of newspapers and tech documents.

dluman commented 7 years ago

@hugovk Thanks for adding the repo; apparently neither early-morning self nor entire-November self thought that was a good idea.

There's so much more to do than I had time/brains for, but I subscribe to the "done is better than perfect" school on this one!

tra38 commented 7 years ago

I did "consume" 50,000 words of content before getting bored and skipping to the end. However, I'm not sure if that was really consumption, as I merely skimmed the text instead of actually fully reading it. I was interested in some of the news articles' headlines, some of the technical manuals, some of the "workplace gripes", but not in all of the generated text. I just zoned out of the text that I didn't care about, and only read the text that I cared about. (This is a problem I noticed when reading human-generated novels too, though. Sometimes I would skip whole passages due to boredom and move onto the "more exciting" sections...and other times, I would re-read passages to realize what they were talking about).

The subject matter was interesting to me, as I knew absolutely nothing about the concept of phreaking, and so having the plot be revealed through "workplace gripes" made me fill in the details...making me pretend that this guy is actually part of a secret underground cabal of evil, conspiratorial hackers, instead of a weirdo who just dives into dumpsters every day for fun.

Was this novel a failure? It was not "human-readable". But it wasn't repetitive either, and it meets all the criteria that I hoped this novel would meet. I am upset at myself for getting "overhyped", except that, no, this technology meets the hype...it just wasn't enough. I think something similar to the "AI Effect" is at play here, that the goal-posts for "human-readable novels" are shifting. Three possible reasons:

  1. Maybe, humans are subconsciously trying to portray ourselves as superior to the machine, and will dismiss the machine's work regardless of whether it's actually good. We can minimize our disgust, but not eliminate it. In which case, accepting the disgust, and producing work that is "good enough" is probably the best case.

  2. Maybe, programmers underestimated how difficult it is for humans to write human-readable novels. The tech has advanced far enough to produce novels that are at the level of first-time novelists...decent enough for quick reading, but not really something deep or enlightening. The focus now should be on refining the technology to produce something "good".

  3. Maybe, the bar of "readability" is set too high. Many ebooks aren't fully completed, and all these ebooks were written by humans. It's possible that making sure every last word is consumed by a reader may be too high a bar even for human beings to clear. And if so, should machines try to meet that bar? If the novel has the potential to be read (as your work certainly does), is that good enough?

EDIT: Additional comments - The "workplace gripes" can sometimes feel incoherent, though the use of similar sentences limited this incoherence. The simulation didn't really add much to the final output, though I know this may be due to lack of time. The simulated text did help to add some coherency to the end of the generated diary entry, making it a little easier to attempt to read the "workplace gripes".

dluman commented 7 years ago

I agree with quite a few of your points, @tra38; a lot of the truncated ambition results from the time/effort deficit that I think I encountered. However, this is something I'm interested in taking forward, as the "data" didn't seem to leak through too much and using data to make chance-based choices is interesting to me.

Overall, this was more an exercise in figuring out what computational methods make possible and how the "form" can expand, which seems to be an interesting question to me. As such, I think that my inner, conceptual core shows through in the process and product. But, I look forward to a year of research and an intense effort next November.