dariusk / NaNoGenMo-2015

National Novel Generation Month, 2015 edition.

The Tale of the Github Repository #162

Open MichaelPaulukonis opened 8 years ago

MichaelPaulukonis commented 8 years ago

GitHub repositories are full of events. Can we frame these events as a story?

OF COURSE WE CAN!!!

Can we make this story interesting?

Let's avoid answering that, shall we?

Code and notes are here: https://github.com/MichaelPaulukonis/NaNoGenMo2015/tree/master/github-narrative

I have a small proof-of-concept gist here.

Extracts:

On Sun Oct 25 2015 19:27:52, dariusk opened a new issue called "Resources". But it's an admin issue, so who cares? On Sun Oct 25 2015 19:28:38, dariusk opened a new issue called "Procedural Visual Novel". On Sun Oct 25 2015 19:55:08, bcj opened a new issue called "An Attempt to Exhaust Memory to Simulate a Place to Attempt to Exhaust". On Sun Oct 25 2015 22:41:47, zachwhalen opened a new issue called "Using a password dump for a corpus". On Wed Oct 28 2015 16:18:43, tra38 opened a new issue called "The Atheists Who Believe In God". And it's been completed. Sweet!
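Sentences like the ones in this extract could be produced along the following lines. This is only a minimal sketch, not the code in the linked repository: the issue shape (`user`, `title`, `createdAt`, `labels`) and the function names are assumptions made for illustration, and timestamps are formatted in UTC so the output does not depend on the local time zone.

```javascript
// Hypothetical issue shape: { user, title, createdAt (ISO 8601), labels: [string] }
const DAYS = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat'];
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

const pad = (n) => String(n).padStart(2, '0');

// Format an ISO timestamp as e.g. "Sun Oct 25 2015 19:27:52" (UTC).
function formatTimestamp(iso) {
  const d = new Date(iso);
  const date = `${DAYS[d.getUTCDay()]} ${MONTHS[d.getUTCMonth()]} ${pad(d.getUTCDate())} ${d.getUTCFullYear()}`;
  const time = `${pad(d.getUTCHours())}:${pad(d.getUTCMinutes())}:${pad(d.getUTCSeconds())}`;
  return `${date} ${time}`;
}

// Turn one issue into one or more sentences, with asides driven by its labels.
function narrateIssue(issue) {
  const parts = [
    `On ${formatTimestamp(issue.createdAt)}, ${issue.user} opened a new issue called "${issue.title}".`,
  ];
  if (issue.labels.includes('admin')) parts.push("But it's an admin issue, so who cares?");
  if (issue.labels.includes('completed')) parts.push("And it's been completed. Sweet!");
  return parts.join(' ');
}

// Narrate a list of issues in chronological order.
function narrate(issues) {
  return [...issues]
    .sort((a, b) => new Date(a.createdAt) - new Date(b.createdAt))
    .map(narrateIssue)
    .join(' ');
}
```

Keeping the label-driven asides separate from the base sentence makes it easy to add more of them later (second issue by the same user, first completed project, and so on).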

The plan is to dig deeper: getting time-frames for when labels were placed (so we know what was completed first, etc.) and picking out interesting events (opened a second issue, completed a second project, etc.). If the word count is absolutely abominable, I also have the entire body of comments to play with.

My original thought was to tell the "story" of my past year on GitHub, but only the last 300 events (or the last 90 days of events) are available via the API, so I switched tracks.

MichaelPaulukonis commented 8 years ago

It has not escaped my attention that this is also the story of NaNoGenMo-2015.

MichaelPaulukonis commented 8 years ago

@tra38 - you might find this interesting. You and @hugovk were some of the (in)direct inspirations for this.

tra38 commented 8 years ago

Yeah, I do find this interesting. I once had the idea of using information from GitHub to write a narrative, but I couldn't think of a good way of doing it. Will be monitoring progress.

MichaelPaulukonis commented 8 years ago

It needs work, but it has nearly 100K words: https://gist.github.com/MichaelPaulukonis/1b95c25f7eca4942933a

Of course, that's because it's including ALL THE COMMENTS. ugh.

A future elaboration will replace comments longer than a certain number of characters with a summary.
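One cheap way to approximate that is plain truncation rather than a real summary. The sketch below is an assumption about how it might look: `summarizeComment` and the 280-character default are made up for illustration.

```javascript
// Hypothetical helper: keep short comments whole and cut long ones at a
// sentence boundary where possible. maxChars is an arbitrary threshold.
function summarizeComment(body, maxChars = 280) {
  // Collapse whitespace so line breaks do not count against the limit.
  const text = body.replace(/\s+/g, ' ').trim();
  if (text.length <= maxChars) return text;

  const head = text.slice(0, maxChars);

  // Prefer ending on the last complete sentence inside the limit.
  const lastStop = Math.max(
    head.lastIndexOf('. '),
    head.lastIndexOf('! '),
    head.lastIndexOf('? ')
  );
  if (lastStop > 0) return head.slice(0, lastStop + 1);

  // Otherwise cut at the last word boundary and mark the truncation.
  const lastSpace = head.lastIndexOf(' ');
  return (lastSpace > 0 ? head.slice(0, lastSpace) : head) + '...';
}
```

This keeps the first complete sentences of a long comment, which is often enough to carry the gist in a generated narrative.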

Another elaboration is to try to suss out the target repo, and extract the languages used. We'll see how this goes.

Of some interest may be the ranking algorithm, which ranks issues. It's more of a proof-of-concept, but a start.

// rank by number of comments
if (iss.comments) { atom.rank += iss.comments.length; }

// rank by labels
if (iss.labelTypes.preview) { atom.rank += 5; }
if (iss.labelTypes.completed) { atom.rank += 20; }
if (iss.labelTypes.closed) { atom.rank -= 1; } // hrm...
// admin issues aren't as interesting....
if (!iss.labelTypes.admin) { atom.rank += 5; }

A better comment-ranking algorithm would give more weight to non-author comments, for example. Other signals that should carry some weight: cross-references, time-to-completion, how early the issue was opened, notable firsts, etc.
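As a rough sketch of the first of those ideas, the ranking could weight comments by who wrote them. The field names `author` and `comments[].user` are assumptions about the data shape, and the 2:1 weighting is arbitrary; the label weights are the ones from the snippet above.

```javascript
// Hypothetical issue shape: { author, comments: [{ user }], labelTypes: { ... } }
function rankIssue(iss) {
  let rank = 0;

  // Comments from other people count double; the author's own count once.
  for (const c of iss.comments || []) {
    rank += c.user === iss.author ? 1 : 2;
  }

  // Same label weights as the proof-of-concept.
  const labels = iss.labelTypes || {};
  if (labels.preview) rank += 5;
  if (labels.completed) rank += 20;
  if (labels.closed) rank -= 1;
  if (!labels.admin) rank += 5;

  return rank;
}

// Return the n highest-ranked issues, best first.
function topIssues(issues, n = 20) {
  return issues
    .map((iss) => ({ iss, rank: rankIssue(iss) }))
    .sort((a, b) => b.rank - a.rank)
    .slice(0, n)
    .map(({ iss }) => iss);
}
```

Cross-references, time-to-completion and the other signals would slot in as further terms in `rankIssue`.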

As of 2015.11.30, 10:39pm EST:

The top 20 issues, ranked:

Compiler pipeline + writers' techniques = a "proper novel" ::blink:: was opened by cpressey.

Simulationist Fantasy Novel was opened by mattfister.

"Where I'm From" poem & novel generator was opened by marythought.

A daring journey to the bottom of the pit was opened by mcwill97.

It takes a "Village" to translate "Hamlet" was opened by dkurth.

The Atheists Who Believe In God was opened by tra38.

Co-authored Procedural Novel was opened by dariusk.

Cheating pseudo-entry: Vocabulary mashup was opened by mewo2.

Generative Socratic Dialogues was opened by yourpalal.

The Null Earth Catalog was opened by coleww.

intense intents in tents was opened by emdaniels.

Browne Garden Commonplace Book was opened by spenteco.

Interlude: (Un)Sound Structures was opened by jseakle.

The TPP: A "Found" Generated Novel was opened by coleww.

Saga III: Another Original Play by a Computer was opened by lizadaly.

Goal-driven use of scenes and sequels for capers was opened by enkiv2.

Molly's Feed was opened by moonmilk.

(NaNoGenMo: Dadism 2.0) 2.0 was opened by tra38.

A play based on the "french 4chan" was opened by WhiteFangs.

Neuralgae was opened by spikelynch.

MichaelPaulukonis commented 8 years ago

@hugovk - it hits 50K, stick a fork in it and call it done.

hugovk commented 8 years ago

@MichaelPaulukonis Splat.

I like that it's also pulled in the images.

> Another elaboration is to try to suss out the target repo, and extract the languages used.

https://developer.github.com/v3/repos/#list-languages

MichaelPaulukonis commented 8 years ago

Thank you, sir!

Yup, but each issue has the repo linked in a different way -- so it'll be a combination of parsing, plus a manual whitelist for those that never indicated it, or that link to multiple repos (say, when just commenting on somebody else's project).

But THEN I'll be able to complete the Language survey for each year!

...finally.
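The parsing-plus-whitelist idea described above might look roughly like this. It is a sketch under assumptions: the issue shape (`number`, `body`), the function names, and the override table are invented for illustration. The URL built at the end is the GitHub "list repository languages" endpoint that @hugovk linked.

```javascript
// Manual overrides for issues whose repo cannot be parsed reliably.
// Keys are issue numbers; the commented entry is a made-up placeholder.
const REPO_OVERRIDES = {
  // 999: 'someuser/some-repo',
};

// Matches https://github.com/<owner>/<repo>, but not gist.github.com links.
const REPO_LINK = /https?:\/\/github\.com\/([\w.-]+)\/([\w.-]+)/g;

// Collect the distinct owner/repo pairs linked from a block of text.
function findRepos(text) {
  const found = new Set();
  for (const [, owner, repo] of text.matchAll(REPO_LINK)) {
    // Drop sentence-ending dots and a trailing ".git" from the repo name.
    const name = repo.replace(/\.+$/, '').replace(/\.git$/, '');
    found.add(`${owner}/${name}`);
  }
  return [...found];
}

// Resolve an issue to a single repo, or null when none or several are linked.
function targetRepo(issue, overrides = REPO_OVERRIDES) {
  if (overrides[issue.number]) return overrides[issue.number];
  const repos = findRepos(issue.body || '');
  return repos.length === 1 ? repos[0] : null;
}

// URL of the languages endpoint for an "owner/repo" name.
function languagesUrl(fullName) {
  return `https://api.github.com/repos/${fullName}/languages`;
}
```

Issues that come back as `null` are the ones that would need a whitelist entry: either no repo was linked, or several were.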