NaNoGenMo / 2018

National Novel Generation Month, 2018 edition.
https://nanogenmo.github.io/

IS_IT_LOVE #107

Open rchrdlln opened 5 years ago

rchrdlln commented 5 years ago

Hey, I forgot to officially sign up for this, but I did generate a "novel" in November, and I'd like to post the source code and finished work today if possible. It's entitled "IS IT LOVE" (or so it titled itself).

rchrdlln commented 5 years ago

Repository: https://github.com/rchrdlln/nanogenmo

Novel: https://github.com/rchrdlln/nanogenmo/blob/master/Microsoft%20Word%20-%20IS_IT_LOVE.docx.pdf

Google Books search results pages usually include a sentence or two on either side of the complete sentence containing the search term. I chose some terms I hoped might add up to an interesting non-narrative and manually downloaded 5 or 10 search results pages per search term into separate folders. The first script scrapes the pages into one huge text file. The second script goes through it all, pulls out what appear to be viable sentences and clauses, then reassembles them and cleans the text up a little, though probably not nearly enough. I tried to keep book blurbs out of the source material, without much luck.
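For anyone curious what a two-step pipeline like this can look like, here is a minimal sketch using only the Python standard library. It is not the code from the repository. The folder layout (`*/*.html`), the function names, the word-count thresholds, and the shuffle-based reassembly are all assumptions made for illustration; the actual scripts may pick and order sentences quite differently.

```python
import random
import re
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of one saved page as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def scrape_folders(root):
    """Step 1: join the text of every saved page found one folder below root.

    Assumes one subfolder per search term, each holding .html files.
    """
    pages = sorted(Path(root).glob("*/*.html"))
    return "\n".join(
        html_to_text(p.read_text(encoding="utf-8", errors="ignore"))
        for p in pages
    )


# Split after ., ! or ? when followed by whitespace (a crude heuristic, no NLP).
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def viable_sentences(text, min_words=4, max_words=40):
    """Step 2a: keep candidates that look like complete sentences."""
    kept = []
    for candidate in SENTENCE_END.split(text):
        candidate = re.sub(r"\s+", " ", candidate).strip()
        words = candidate.split()
        if not words:
            continue
        if (min_words <= len(words) <= max_words
                and candidate[0].isupper()
                and candidate[-1] in ".!?"):
            kept.append(candidate)
    return kept


def reassemble(sentences, per_paragraph=5, seed=0):
    """Step 2b: shuffle the sentences and group them into paragraphs."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    paragraphs = [
        " ".join(shuffled[i:i + per_paragraph])
        for i in range(0, len(shuffled), per_paragraph)
    ]
    return "\n\n".join(paragraphs)
```

With a hypothetical `pages/` directory holding the downloaded results, the whole thing would run as `reassemble(viable_sentences(scrape_folders("pages")))`. The capital-letter and end-punctuation checks are only a rough stand-in for "viable"; they will not filter out book blurbs, which also tend to be well-formed sentences.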

No AI, no ML, no NLP. 68,421 words.