Classifying Meanings & Documents - So & Long 2015

Computational-Content-Analysis-2020 / Readings-Responses

Repository for organising "exemplary" readings, and posting responses.

Open jamesallenevans opened 4 years ago

jamesallenevans commented 4 years ago

Post your questions here (as comments) for:

So, Richard and Hoyt Long. 2015. “Literary Pattern Recognition: Modernism between Close Reading and Machine Learning.” Critical Inquiry 42(2): 235-267.

lkcao commented 4 years ago

This paper is really interesting, as it applies machine learning to one of the disciplines with the most inherent ambiguity and complexity. My questions are as follows: (1) How should we choose algorithms in our research? This paper uses naive Bayes, which produces good results, but the authors do not explain why they chose this method over others. (2) Is a bag of words a proper input for studies of poetry? Sometimes I suspect it is not the words but the sequence of words that makes something a poem. "Jack eats a rabbit" reads like a plain statement, while "A rabbit eats Jack" reads more like a poem. (And if haiku were mixed with 300-character scientific reports about natural disasters, might the current algorithm fail?)
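To make the word-order point in (2) concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline; the whitespace tokenizer and the helper names `unigrams` and `bigrams` are assumptions made for illustration. It shows that the two example sentences have identical bags of words and differ only once adjacent word pairs are counted.

```python
from collections import Counter

def unigrams(text):
    """Lowercase the text and split on whitespace into single-word tokens."""
    return text.lower().split()

def bigrams(text):
    """Return adjacent word pairs, which retain local word order."""
    tokens = unigrams(text)
    return list(zip(tokens, tokens[1:]))

a = "Jack eats a rabbit"
b = "A rabbit eats Jack"

# A bag of words only counts tokens, so the two sentences look identical.
same_bag = Counter(unigrams(a)) == Counter(unigrams(b))

# Bigram counts keep some ordering information, so the sentences differ.
same_bigrams = Counter(bigrams(a)) == Counter(bigrams(b))

print(same_bag, same_bigrams)  # True False
```

A unigram classifier therefore cannot tell these two sentences apart, whatever algorithm sits on top of the features.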

bjcliang-uchi commented 4 years ago

In their approach, "I" is among the deleted stopwords (p. 255), but doesn't haiku have a strong tendency to avoid "I" (as discussed at the very beginning of the paper)? If so, it should be a variable that matters.
I really like that they conduct literary analysis before applying statistical methods. But as @clk16 mentions, intuitively the sequence of words (or at least n-grams) is more important than single words (unigrams). They do not seem to actually integrate their rich understanding of the history and context of haiku into their statistical application.
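The stopword concern can be illustrated with a small sketch. The stopword sets below are hypothetical and are not the list used in the paper; the `features` helper and its punctuation handling are likewise assumptions. The point is only that once "I" is in the stopword list, its presence or absence can no longer reach the classifier as a feature.

```python
from collections import Counter

# Hypothetical stopword lists for illustration only (not the paper's list).
STOPWORDS_WITH_I = {"i", "the", "a", "of", "and"}
STOPWORDS_WITHOUT_I = STOPWORDS_WITH_I - {"i"}

def features(text, stopwords):
    """Count lowercase tokens, stripping edge punctuation and dropping stopwords."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return Counter(t for t in tokens if t and t not in stopwords)

line = "I wandered lonely as a cloud"

with_i_removed = features(line, STOPWORDS_WITH_I)
with_i_kept = features(line, STOPWORDS_WITHOUT_I)

print("i" in with_i_removed)  # False: the first-person signal is discarded
print(with_i_kept["i"])       # 1: the pronoun survives as a countable feature
```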

jsmono commented 4 years ago

Although I am presenting this article, one thing I couldn't figure out is how their approach could be used on other kinds of literature. For example, if I wanted to analyze longer poems like "Paradise Lost", their method might not be very effective, since haiku is a particularly concise form of poetry, which makes it easier to analyze and categorize. Nevertheless, I do think their method is applicable to short poems in many other languages that have a unique and well-known format.

kdaej commented 4 years ago

This article suggests that a machine learning algorithm can successfully identify haiku poems among other English poems. While there is some truth to it, I believe this success was due to the homogeneity of haiku poems. They follow a certain style with a restricted number of words and mostly describe seasons. Would this mean that text classification, in this case, relies on the distinctive nature of the text data more than on the algorithm?
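One way to probe this question is to look at how little a naive Bayes classifier actually does. The sketch below is a from-scratch multinomial naive Bayes with add-one smoothing, assuming whitespace-tokenized input; it is not the authors' implementation, and the four training "poems" are invented toy word lists, not data from the paper. The score for each class is just a prior plus a sum of per-word log probabilities, so distinctive vocabulary in the data does most of the work.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (tokens, label). Returns (word_counts, class_counts, vocab)."""
    word_counts, class_counts, vocab = {}, Counter(), set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def log_posterior(tokens, label, model):
    """Unnormalized log P(label | tokens) with add-one (Laplace) smoothing."""
    word_counts, class_counts, vocab = model
    score = math.log(class_counts[label] / sum(class_counts.values()))
    total_words = sum(word_counts[label].values())
    for t in tokens:
        if t in vocab:  # words never seen in training are ignored
            score += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
    return score

def predict(tokens, model):
    """Return the label with the highest log posterior."""
    return max(model[1], key=lambda label: log_posterior(tokens, label, model))

# Invented toy data: word lists standing in for poems.
toy_docs = [
    ("autumn moon pond frog".split(), "haiku"),
    ("snow moon blossom silence".split(), "haiku"),
    ("love heart city time".split(), "other"),
    ("heart city war dream time".split(), "other"),
]
model = train(toy_docs)

print(predict("moon frog snow".split(), model))  # haiku
print(predict("city heart".split(), model))      # other
```

If the two classes shared most of their vocabulary, the per-word terms would roughly cancel and the same algorithm would separate them far less cleanly.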

yaoxishi commented 4 years ago

It's very interesting to apply machine learning algorithms to haiku, since poems are a play of words that can express human emotion both vaguely and vividly. I am curious how the algorithms could capture the vague meaning of words and sentences, and how the holistic meaning of a poem could be understood.