newtfire / introDH-Hub

shared repo for DIGIT 100: Introduction to Digital Humanities class at Penn State Erie, The Behrend College
https://newtfire.github.io/introDH-Hub/
Creative Commons Zero v1.0 Universal

Mystery text discussion of fc.txt #63

Closed ebeshero closed 1 year ago

ebeshero commented 1 year ago

Post your screenshots and discuss your findings about fc.txt here!

tylerakam commented 1 year ago

In exploring the mystery text, I found a few details that gave me suspicions about what it is. At first I could not tell whether it was fiction or nonfiction, but the Ngrams made me suspect that the text takes place near water in some way. From there, I tried keywords in the KWIC, such as "sea" and "ship." Those results showed me that the setting is a ship at some point (though I'm not fully convinced of that just yet). In the KWIC context I also saw words such as "monster." That helps in deciding whether the text is fiction (it definitely makes me think so, though it could still be an older piece of nonfiction). When I searched for "monster," I found some very revealing information: references to a classic story that everyone knows. I won't reveal what it is (I believe I am the first to post), but I am very certain I know which story I am dealing with. It is completely fiction and a good story to read as Halloween approaches.

image image

Overall, I found that AntConc's tools let me discover different details about the text in different ways. Ngrams showed me which words appeared together most frequently, which led me to try different KWIC searches, then more Ngrams, and so on. Seeing the word frequencies was very helpful in identifying important ideas and phrases. I'm fairly confident in my guess about what the text is, so I hope I am not wrong. I believe my understanding of the program has improved, and I can use its tools in the future.
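For anyone curious how a KWIC (keyword in context) search like the ones above could work outside AntConc, here is a minimal Python sketch. It is not AntConc's actual implementation; the function name `kwic`, the `width` parameter, and the sample sentence are all made up for illustration, and the sample is not taken from fc.txt.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left context, match, right context) for each whole-word hit.

    `width` is the number of characters of context kept on each side.
    Matching is case-insensitive, an assumption made for this sketch.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    rows = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        rows.append((left, m.group(0), right))
    return rows

# Invented sample text, standing in for a real file such as fc.txt.
sample = "The ship was on the sea. A monster crossed the sea of ice."

for left, hit, right in kwic(sample, "sea", width=15):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>15} [{hit}] {right}")
```

Lining the keyword up in a column is what makes neighbouring words like "monster" easy to spot when scanning the results.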

ZSchleger commented 1 year ago

When I put the mystery text into Voyant, the most common words were "man", "life", "father", "time", and "shall". When I put it into AntConc and set the N-Gram size to 3, the most common phrase was "the old man", which appeared 29 times. Looking at the KWIC, I found that the old man's name is De Lacy and that there is also a man named Felix. I then decreased the N-Gram size to 2, and one of the most common phrases was "of my". When I looked at its KWIC, I noticed the word "death". Voyant says "death" is used 80 times in the text, most often in the 8th segment of the document. Looking at the KWIC again, I saw the names Victor and Elizabeth, and with these names I believe I know what the mystery text is.

mtfc1 mtfc2 mtfc3
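The "used the most in the 8th segment" observation can be approximated by splitting the tokens into equal parts and counting a word in each part. This is only a rough sketch: the function name `segment_counts`, the default of 10 segments, and the tokenizer are assumptions, and Voyant may segment and tokenize differently. The sample text is invented.

```python
import re

def segment_counts(text, word, segments=10):
    """Count `word` in each of `segments` roughly equal slices of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Slice boundaries spread proportionally across the token list.
    bounds = [len(tokens) * k // segments for k in range(segments + 1)]
    return [tokens[a:b].count(word.lower()) for a, b in zip(bounds, bounds[1:])]

# Invented sample: 16 tokens, with "death" only near the end.
sample = "man life father time " * 3 + "death death shall death"

print(segment_counts(sample, "death", segments=4))
```

The position of the largest number in the returned list shows where in the document the word clusters.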

justnobl commented 1 year ago

Working with this mystery text, I noticed a few things. There are a few phrases that are repeated a lot more than you would expect if you just read the text. Setting the Ngram size to 4 showed me that there is a huge drop from how many times a single word is used in a text to how many times a whole phrase is used. With Ngram size 4, there were phrases with frequencies in the double digits and others where the frequency just broke 5. The most frequent "phrase" I found was "with you on..."; it was used many times in different ways while basically repeating the previous way it was used. This really showed me how we might overlook how frequently certain things are used in texts.

antcoihadbeen withyouonyourant

UAEFasool commented 1 year ago

Using AntConc, the general idea I got of fc.txt was that it is a pessimistic and gloomy story about the author recalling the murder of someone close to him.

The first thing I did in AntConc was set the N-Gram size to 3. This let me see the most commonly repeated three-word phrases in the text, including "i did not", "i could not", "the old man", "that i had", "but i was", "that i was", and "i do not". Interestingly, most of these phrases are in the past tense, so I could tell the text was reflecting on a past event that the author was not fond of. In particular, "i did not" repeated 34 times, "i could not" repeated 32 times, and "i do not" repeated 20 times. These phrases suggest regret and connote an inability to accomplish something, which hints that the author faced a conflict or challenge.

Screen Shot 2022-11-02 at 1 07 44 AM
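The three-word phrase counts described above can be reproduced in a few lines of Python. This is a minimal sketch rather than AntConc's own method: the function name `ngram_counts`, the lowercasing, and the simple letters-and-apostrophes tokenizer are assumptions, and the sample sentence is invented rather than quoted from fc.txt.

```python
import re
from collections import Counter

def ngram_counts(text, n):
    """Count every run of `n` consecutive words in `text`."""
    # Lowercase and keep only letters and apostrophes as word characters.
    tokens = re.findall(r"[a-z']+", text.lower())
    # zip over n staggered copies of the token list to form the n-grams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

# Invented sample text for illustration only.
sample = "I did not see the old man. I did not know the old man."

counts = ngram_counts(sample, 3)
for phrase, freq in counts.most_common(2):
    print(freq, phrase)
```

Note that this sketch lets n-grams run across sentence boundaries (for example "old man i"). Raising `n` to 7 gives longer, rarer sequences, which matches the drop in frequency seen at larger N-Gram sizes.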

To understand the story better, I changed the N-Gram size to 7. The results read less like phrases and more like sentences. The frequencies were lower, but the results shed light on some of the events of the story. For instance, I now knew the story took place in Chamounix, Switzerland. More importantly, I now understood that there was a wedding and a murder.

Screen Shot 2022-11-02 at 1 32 29 AM

AntConc is a useful tool that helped me get an overview of the text by analyzing and reading its patterns. I will use it in my project to compare two texts and will publish the results on my website.