newtfire / introDH-Hub

shared repo for DIGIT 100: Introduction to Digital Humanities class at Penn State Erie, The Behrend College
https://newtfire.github.io/introDH-Hub/
Creative Commons Zero v1.0 Universal

Workshopping views from Voyant / Antconc #14

Closed ebeshero closed 3 years ago

ebeshero commented 3 years ago

Post some screen captures of your findings playing around with Antconc and/or Voyant Tools with combinations of texts available on our Corpus Analysis Assignment! As you post, tell us a little about what you think is interesting!

arrowarchive commented 3 years ago

@ebeshero, I used AntConc to compare Frankenstein and Hamlet, and I noticed something very similar between the two: the phrase "in the" is in the top three for each one, but after analyzing it further, the phrases are used differently in each story (see below: Frankenstein is on the left, Hamlet is on the right)

compare1 compare2

In Frankenstein, the term is used for dramatic emphasis. A lot of adjectives follow the phrase to make the scene more dramatic (and having read Frankenstein myself, the story is very wordy). In Hamlet, the phrase is used for exposition. For the majority of the list, the phrase is used to describe the setting or to draw comparisons from one thing to another.

NOTE: I have a dual-monitor setup, so I cannot crop out the black in the second screen. If there is a way for me to take a photo of a single screen without having to crop out the other, please let me know!

Meganpeck115 commented 3 years ago

I found AntConc was a tad easier to use. I used two different articles. One was an Adobe Photoshop article (top) and the other was an Adobe Lightroom article (bottom). I saw that the n-grams in the first article contained a lot of hyperlinks until I scrolled further down, and I could not quite understand why, since the second one did not. The second article, just like the first one, frequently contained the words "red, green, blue, red".

image

image

tomsheehy commented 3 years ago

I used AntConc to compare Dracula and Macbeth. I quickly discovered that AntConc is very user friendly and efficient for comparing texts. During my comparison, I learned that Macbeth heavily used "Lady Macbeth", making it the most frequently used pair of words in the play, at 71 times. Although her name may also appear in stage settings, the count shows how important Lady Macbeth's role is in this tragedy. This was still surprising to me because in other texts, phrases with the word "the" or "a" are typically the most frequently used. For example, in Dracula, the most used phrase was "of the", used 891 times. ANTCONC_Dracula ANTCONC_Macbeth

dxh405 commented 3 years ago

I chose Dracula and Frankenstein as my two text files. I figured that since they are both about "monsters" in a way, there might be some interesting similarities between the two. As soon as I put both through AntConc, I discovered that both had "of the" as the most frequent phrase.

of the

I then decided to look at the word monster in both of the books. It appears that in Frankenstein, the word monster was used to describe someone or something more often than in Dracula. In Dracula, monster was used to describe something inanimate more often than a person, for example "monstrous waves". That does make sense, as Frankenstein's creature is literally a "monster" in a sense, whereas vampires are vampires, not necessarily monsters.
monster
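
For anyone who wants to reproduce this kind of keyword lookup outside AntConc, here is a minimal keyword-in-context sketch in Python. It is only an approximation of a concordance view, not how AntConc itself works. The function name `kwic`, the `width` parameter, and the sample sentence are made up for illustration; the sample is not a quotation from either novel. Searching on the prefix "monst" picks up both "monster" and "monstrous".

```python
import re

def kwic(text, prefix, width=30):
    """Return (left, hit, right) tuples for every word starting with prefix."""
    # \b anchors the match at a word start; \w* extends it to the end of the word.
    pattern = re.compile(r"\b" + re.escape(prefix) + r"\w*", re.IGNORECASE)
    rows = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        rows.append((left, m.group(0), right))
    return rows

# Hypothetical sample sentence, used only to show the output shape.
sample = "The monster fled. Later, monstrous waves broke over the ship."
for left, hit, right in kwic(sample, "monst", width=15):
    print(f"{left:>15} | {hit} | {right}")
```

Reading the right-hand context of each hit is what lets you tell whether the word describes a person or something inanimate.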

bpm5520 commented 3 years ago

I compared Dracula and Pride and Prejudice since they are both 19th century novels. With n-grams limited to 2 words, "of the" showed up as the most frequent in both works. Also, the top 6 for each include what I assume to be main characters, as "Van Helsing" and "Mr Darcy" show up hundreds of times in each respective work. I also compared them by 5 word long n-grams, which showed a key feature of Dracula. Each chapter of Dracula is framed as a journal or diary entry, shown by the journal and diary entry n-grams being at the top of the charts. Meanwhile, in Pride and Prejudice, there seemed to be no real pattern in the top entries, which were mainly common phrases and a handful of the characters involved.

Comparison1 Comparison2
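
The n-gram counts discussed above can be approximated with a few lines of Python. This is a rough sketch under stated assumptions, not AntConc's actual implementation: `tokenize` and `ngram_counts` are hypothetical helper names, the tokenizer lowercases and keeps only runs of the letters a-z, and the sample string is invented.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: a "word" is any run of the letters a-z after lowercasing.
    return re.findall(r"[a-z]+", text.lower())

def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Invented sample text; swap in the contents of a .txt file to try a real novel.
sample = "of the night and of the day and of the sea"
counts = ngram_counts(tokenize(sample), 2)
print(counts.most_common(3))
```

Changing the second argument from 2 to 5 gives the longer n-grams, which is where repeated headings such as diary entry lines tend to rise to the top.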

amw6765 commented 3 years ago
MarlowFnT_NGram

After comparing the two different story .txt files of Marlowe in AntConc, I was able to see some interesting similarities and differences. The N-Grams of the two seem to be mainly different from one another. Faustus seems to have more clustering of words that deal with good and evil, as well as a lot of talk about good and bad angels, whereas Tamburlaine appears to have more clusterings around god and kings. There was one apparent N-Gram that both stories share when using a minimum of 3 words and a maximum of 6 (ex. 'all the world' is an N-Gram used in both). I am quite certain that if I mess around with the constraints even more, I will be able to see more similarities between the two.
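
One way to hunt for n-grams that two texts share, rather than eyeballing two lists, is to build the set of n-grams for each text over the chosen size range and intersect them. The sketch below assumes a simple letters-only tokenizer; `ngrams_in_range` and `shared_ngrams` are hypothetical names, and the two sample strings are invented lines, not quotations from either play.

```python
import re

def tokenize(text):
    # Assumption: lowercase the text and keep only runs of the letters a-z.
    return re.findall(r"[a-z]+", text.lower())

def ngrams_in_range(tokens, n_min, n_max):
    """Collect every n-gram whose length is between n_min and n_max words."""
    found = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            found.add(" ".join(tokens[i:i + n]))
    return found

def shared_ngrams(text_a, text_b, n_min=3, n_max=6):
    a = ngrams_in_range(tokenize(text_a), n_min, n_max)
    b = ngrams_in_range(tokenize(text_b), n_min, n_max)
    return sorted(a & b)

# Invented sample lines standing in for the two plays.
text_a = "he would conquer all the world with fire"
text_b = "not for all the world would she stay"
print(shared_ngrams(text_a, text_b))
```

Lowering `n_min` to 2 would surface many more shared phrases, most of them common function-word pairs.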

AlexanderRAnderson commented 3 years ago

I used AntConc to compare Shakespeare's works, Hamlet and Macbeth. I was interested when I saw the "words" "s" and "d". It turns out they are just the contractions he used. I still thought I could find something interesting by looking at these contractions. I found that Shakespeare used contractions more frequently in Macbeth than in Hamlet. Macbeth was written around 1606, while Hamlet was written earlier, around 1599 to 1601. I found this interesting because Shakespeare seems to use more contractions in the later work. This may imply that he used contractions more often later in his career, but I cannot be sure until I look at more of his works. Comparison
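
The stray "s" and "d" entries are most likely a side effect of tokenization: if only letters count as part of a word, an apostrophe splits "vanish'd" into "vanish" and "d". The sketch below illustrates that effect and also shows a per-1,000-words rate, which is fairer than raw counts when two plays differ in length. The function names, the sample phrase, and the choice of "s" and "d" as targets are assumptions for illustration, and the second pattern only handles straight apostrophes.

```python
import re

def split_on_apostrophes(text):
    # Only letters count as word characters, so "'" acts as a separator.
    return re.findall(r"[a-z]+", text.lower())

def keep_apostrophes(text):
    # Allow straight apostrophes inside a word so contractions stay whole.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def per_thousand(tokens, targets=("s", "d")):
    """Rate of target tokens per 1,000 tokens (0.0 for an empty list)."""
    hits = sum(1 for t in tokens if t in targets)
    return 1000 * hits / len(tokens) if tokens else 0.0

# Invented sample phrase, not a quotation.
line = "He's vanish'd"
print(split_on_apostrophes(line))
print(keep_apostrophes(line))
print(per_thousand(split_on_apostrophes(line)))
```

Comparing the `per_thousand` value for each play, rather than the raw frequency column, avoids the longer play looking contraction-heavy just because it has more words.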

kzp308 commented 3 years ago

I found that AntConc was easier to use for this type of comparative exercise. I compared Dracula and Frankenstein for this exercise because I think it's in the spirit of October to do scarier works. I set both the minimum and the maximum to two words. When I put Dracula into AntConc, I found that the most frequently used words were "of the". It was used 863 times in the text. When I put Frankenstein into AntConc, I found that, coincidentally, the two words most frequently used together were also "of the". I thought this was really interesting to learn about. Even though the words don't seem that important, they are. Screenshot (36) Screenshot (35)

argynarg commented 3 years ago

I examined Macbeth vs Hamlet to see if there were any differences in word usage between the two plays and found them to be more similar than I expected. The most frequent N-grams are pretty much the same in both and are fairly common sets of words. I was hoping for something a little more interesting considering the works being analyzed; perhaps increasing the maximum number of words in each N-gram would create more interesting results. image image

nxh5137 commented 3 years ago

I compared Frankenstein and Pride & Prejudice in AntConc to see if there were any similarities between the two stories, looking at how frequently the words were used, and I saw similarities between the two. I had the N-Gram Size at a min. of 2 and a max. of 4. 'of the' is the most frequent sequence of words, used 990 times, as you can see in the pictures below. Screenshot (17) Screenshot (18)

When I looked deeper for word sequences that were not the same generic ones anyone can find in any kind of writing, I stumbled upon a character name that appears in both stories. 'Elizabeth was' appears in both stories, which have nothing to do with each other. It is most likely a coincidence, since there is a big age gap between the two authors, making the chance of them seeing each other incredibly slim, but Shelley might've gotten the name from Austen. Screenshot (21)

Screenshot (23)

NickyV1234 commented 3 years ago

The two texts that I analyzed were Charlotte Brontë's Jane Eyre and Emily Brontë's Wuthering Heights. One of the key things I realized was that many of the n-grams that showed up were phrases that used "bonnet". Even though "bonnet" wasn't the most used word, it was used in the most different types of phrases. 20201009_150215 20201009_150222 20201009_150236 1

jzm6677 commented 3 years ago

I used AntConc to look at Macbeth and Pride and Prejudice. An n-gram I saw often was "in the". 2

am0eba-byte commented 3 years ago

Here's my tardy entry: I decided to clean up and use Alice in Wonderland by Lewis Carroll and Dante's Inferno to compare through AntConc and Voyant. I know, they couldn't be more different pieces of literature: one's an Italian epic poem (I used an English translation) and one is an English children's novel. I thought it would be interesting to explore these two because their stories draw some interesting parallels - both of the main characters are going on a journey through fantastical realms (Hell and Wonderland), and both of them meet other characters within these realms that either help or hinder their journey. Here are the screen captures I gathered from them on AntConc:

Alice in Wonderland 3-gram

alice_3gram_antconc

Alice in Wonderland 4-gram

alice_4gram_antconc

Dante's Inferno 3-gram

inferno_3gram_antconc

Dante's Inferno 4-gram

inferno_4gram_antconc

Comparing the 3-grams, I thought it was interesting how few of the frequent 3-grams in Dante's Inferno contain "he said," or "I said," or "[character] said" compared to Alice in Wonderland. I think this reflects how often Alice's crazy character creatures verbally interacted with her, while in Inferno, Dante's demonic and hell-dwelling encounters were a bit less verbal - and Dante had a much different way of expressing character interactions, as I could sort of deduce from the 4-grams. Instead of saying "'yada yada,' said Virgil" he writes "'yada yada,' and he to me 'yada yada yada'". Maybe that's because it's originally in Italian, and it's super old Italian, so even stranger.

Here are the Voyant Word Clouds:

Alice in Wonderland

alicewonderlandCloud

Dante's Inferno

danteinfernoCloud

This was pretty nifty - the most frequently used word in both was "said," which strangely isn't obvious in Inferno's n-grams. Also, you can definitely date Inferno by looking at this word cloud and noticing the prevalence of "doth," "unto," and "shalt".

gabbiedoster commented 3 years ago

I decided to compare Shakespeare's Romeo and Juliet (left) to Jane Austen's Pride and Prejudice (right). In doing so, I found the n-grams of both the play and the novel. In Shakespeare's Romeo and Juliet, most of the n-gram terms are individualized compared to the n-grams in Pride and Prejudice. One n-gram that's used frequently in both texts is "i am". Another discovery I made was that most n-grams in the text of Pride and Prejudice are in past tense or referring, whereas in Romeo and Juliet the n-grams are in present tense. This may simply be explained by the tenses in which the book and play were originally written.

Screen Shot 2020-10-14 at 2 33 32 PM