ebeshero closed this 11 months ago
I would like to start with how incredibly interesting the Voyant tool made a simple article look. In all seriousness, this was fun to look at and explore. Every n-gram size below five gives me frequency counts above 5. What I mean is, when I set the n-gram size to 4, it gives me frequencies like 10 and 12. What's more interesting is that when I set the n-gram size to 3, the highest frequency is 109, for the phrase "I could not." I couldn't get much of the story; while I was reading, I was trying to figure out when the story takes place. From the way the characters interacted, you could tell they were children. They would often say "I could not," "I can't," "I don't," "I am not," etc., as if they were defending themselves.
I chose the text jcb.txt:
An n-gram size of 5 has one phrase with more than six uses, and that's "as well as i could," but with a size of 4, the highest is 12 with "in the course of."
"I could not" is repeated 109 times, and the next highest after that is "I did not" with 60.
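For anyone who wants to check counts like these outside Voyant, here is a rough sketch of n-gram counting in Python. The sample sentence, the `tokenize` helper, and the `ngram_counts` helper are all made up for illustration, and the simple lowercase tokenizer is an assumption. Voyant may split words differently, so the numbers from jcb.txt would not necessarily match.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase everything and keep runs of letters and apostrophes.
    # Voyant's own tokenizer may behave differently.
    return re.findall(r"[a-z']+", text.lower())

def ngram_counts(tokens, n):
    # Slide a window of n tokens across the text and count each phrase.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)

# Hypothetical sample text, not taken from jcb.txt.
sample = "I could not go. I could not stay. I did not know what I could do."
tokens = tokenize(sample)

trigrams = ngram_counts(tokens, 3)
print(trigrams.most_common(1))
```

To try it on the real file, you could read jcb.txt into a string and pass it to `tokenize` in place of `sample`, then change `n` to 4 or 5 to compare phrase lengths.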
Using KWIC on a few of the n-grams, I think the text is kind of dreary, or at least solemn. A lot of the context of "as well as I could" and "I could not" seems to be in that tone. The most common context for "I could not" is the character discussing the problems they have to "bear" or endure.
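A keyword-in-context view like the one described above can be approximated with a few lines of Python. This is only a sketch: the `kwic` function, the `width` parameter, and the sample sentence are hypothetical, and it is not how Voyant itself builds its Contexts panel.

```python
import re

def tokenize(text):
    # Assumption: simple lowercase word tokenizer, not Voyant's.
    return re.findall(r"[a-z']+", text.lower())

def kwic(tokens, phrase, width=4):
    # Return (left context, phrase, right context) for every match of the phrase.
    target = tokenize(phrase)
    n = len(target)
    rows = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + n:i + n + width])
            rows.append((left, " ".join(target), right))
    return rows

# Hypothetical sample text, not taken from jcb.txt.
sample = "I could not bear it, and I could not endure the rest."
rows = kwic(tokenize(sample), "I could not", width=2)
for left, hit, right in rows:
    print(f"{left:>12} | {hit} | {right}")
```

Reading down the right-hand column is a quick way to see which words tend to follow the phrase, such as "bear" or "endure" in this made-up sample.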
I didn't find any references to violence, which leads me to believe this is more of a drama than the fantasy adventure I thought it might be at first. I also respect the lack of repetition. Once you get past three-word phrases, repetition is very low, with the most repeated four-word phrase being "in the course of" at 12 times. This text has 187,462 total words, and the most repeated one is "Mr," which I find funny because it demonstrates how much talking is done.
Post your screenshots and findings about jcb.txt here!