JonathanReeve closed 7 years ago
An MDW analysis is in https://github.com/lit-mod-viz/middlemarch-critical-histories/blob/master/e3/e3-journal-MDWs.ipynb (although I might rename this notebook at some point). The most distinctive word in quotations from GE-GHLS ("specialists") is "Casaubon," and for all other journals ("non-specialists"), it's "Dorothea." This (unsurprisingly) matches the specialist index chapter graph and its associated major characters.
Specialists are very slightly more likely to quote adjectives and plural nouns, while non-specialists are slightly more likely to quote determiners and particles.
Do you mean keywords? I'm in favour of this!
Would just mean splitting the corpus of all quotations into two bags o' words, then looking for the most divergent frequencies per 100,000 words.
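For what it's worth, here is a rough sketch of what that split-and-compare step could look like. It is not the code from the notebook: the function names (`rates_per_100k`, `most_divergent`), the regex tokenizer, and the toy quotations are all made up for illustration, and ranking by the absolute difference in rate is only one of several reasonable ways to define "most divergent."

```python
import re
from collections import Counter


def rates_per_100k(texts):
    """Pool the texts into one bag of words and return each word's rate per 100,000 tokens."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = [t for text in texts for t in re.findall(r"[a-z']+", text.lower())]
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {word: count * 100_000 / total for word, count in counts.items()}


def most_divergent(group_a, group_b, n=10):
    """Return the n words whose per-100k rates differ most between the two groups.

    Positive differences mean the word is more frequent in group_a,
    negative differences mean it is more frequent in group_b.
    """
    a = rates_per_100k(group_a)
    b = rates_per_100k(group_b)
    diffs = {w: a.get(w, 0.0) - b.get(w, 0.0) for w in set(a) | set(b)}
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]


# Toy data, invented for this example only (not real quotations from the corpus).
specialist_quotes = ["Casaubon Casaubon the key"]
non_specialist_quotes = ["Dorothea Dorothea the key"]

top = most_divergent(specialist_quotes, non_specialist_quotes, n=2)
for word, diff in top:
    print(f"{word}: {diff:+.0f} per 100k")
```

With real data, the two lists would be the quotations from GE-GHLS and from all other journals. A raw rate difference favours very common words, so a ratio or a log-likelihood score might be worth trying as well.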