jamesallenevans opened 4 years ago
I found this paper's finding of a word-level positivity bias independent of frequency very intriguing. I would normally think of positive/negative sentiment as associated with phrases or sentences, given their context. Why did the researchers choose to study the most frequent words rather than phrases?
Furthermore, could the methodology used in this study be used to characterize the sentiment of text with a specialized purpose? The corpora used in the study are wide-ranging, which helps with generalization, and the study mentions that the hedonometer measurements from Twitter correlate strongly with the Gallup index. However, there could be situations where the context of the text changes the most frequent words and shifts the positivity/negativity, such as in political speeches. It would be interesting to see this methodology applied at a narrower scope.
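For readers who want the mechanics: the hedonometer reduces a text to a frequency-weighted average of per-word happiness ratings. A minimal sketch of that idea in Python (the scores below are illustrative stand-ins, not the paper's actual labMT values):

```python
# Sketch of hedonometer-style scoring: the happiness of a text is the
# frequency-weighted mean of per-word happiness ratings (1 = saddest,
# 9 = happiest). Scores here are illustrative stand-ins, not labMT values.
from collections import Counter

happiness = {"love": 8.4, "win": 7.6, "the": 5.0, "lost": 2.8, "war": 1.9}

def average_happiness(text: str) -> float:
    counts = Counter(w for w in text.lower().split() if w in happiness)
    total = sum(counts.values())
    if total == 0:
        return 5.0  # neutral fallback when no rated words appear
    return sum(happiness[w] * n for w, n in counts.items()) / total

print(average_happiness("We lost the war but love will win"))  # ~5.14
```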
@laurenjli I also found the word-level sentiment analysis to be of interest. Sentiment analysis, even when trained on phrases and sentences, can be unreliable. Taking away the context provided by a larger text sample and looking only at words in isolation made me wonder whether the authors could really claim that human language as a whole skews positively.
I was also curious about the selection process for the languages used in the research. What were their criteria for selection and how did they account for regional differences within those languages?
A very interesting read on the positivity of words across various languages! It's intriguing to see that word use in human languages exhibits this universal positivity bias. I wonder if this is related to how our society advocates the power of positive communication: that we should use positive words, post encouraging comments, and create a positive environment. It would be interesting to find out!
@laurenjli @rkcatipon I agree that taking the words out of context is problematic:
The authors state that their major scientific finding is contingent on words being experienced "in isolation", but this framing does not reflect natural language's context-dependent flow. Additionally, the way they operationalized positivity doesn't account for sarcasm or other nuances (like connotation or implication) that subvert a word's dictionary definition.
Also, several studies on the universality of emotion (e.g., Joshanloo & Weijers, 2014) have suggested that cultures may differ in their beliefs about what constitutes happiness in the first place. The words that people choose to convey happiness could therefore differ across cultures as well, making cross-language comparison difficult.
@skanthan95 I agree with you that the way the authors make cross-language comparisons in positivity is problematic (Fig. 2). It seems to me that they can only claim one language is likely experienced as more positive than another by their respective native speakers, not that one is "objectively" more positive than another.
Regarding your first point, I also wonder why they chose individual words as the unit of analysis. Why not choose another unit, such as the sentence or paragraph, and use generic sentiment analysis algorithms to classify them? These algorithms can be more or less domain-specific, but it seems they can at least capture some of the context-dependence.
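As one concrete option, an off-the-shelf rule-based sentence-level scorer such as NLTK's VADER already handles some of the context (negation, intensifiers) that isolated word ratings miss. A quick sketch, assuming nltk and its vader_lexicon are installed:

```python
# Sentence-level scoring with NLTK's rule-based VADER, which applies
# heuristics for negation and intensifiers that word-level ratings miss.
# Requires: pip install nltk, then nltk.download('vader_lexicon')
from nltk.sentiment.vader import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
for sentence in ["This movie is good.", "This movie is not good."]:
    print(sentence, sia.polarity_scores(sentence)["compound"])
# The negated sentence gets a negative compound score, even though a
# word-by-word tally would register one positive word ("good").
```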
I have personally found this sentiment analysis approach very useful when analyzing large sets of textual data, though I share the concerns about the intricacies of context that affect readers' interpretation. Applying this approach to the narrative trajectories of novels may also be problematic in its details: even when most words in a paragraph share a certain connotative association, a reader's attention may be fixed on a few specific words with a different association, and the sentiment those words carry will not be captured accurately. Are there ways to take into account how much a reader focuses on certain words compared to others, and to modify the algorithms that compute a text's sentiment accordingly?
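One crude way to encode that idea: weight each word's score by an assumed attention weight (from eye-tracking, emphasis markup, or a model's attention) instead of raw frequency. The scores and weights below are hypothetical placeholders:

```python
# Attention-weighted sentiment: weight each word's happiness score by how
# much attention a reader is assumed to pay it, rather than by raw frequency.
# Scores and attention weights are hypothetical placeholders.
happiness = {"funeral": 1.5, "flowers": 7.6, "quiet": 5.5}
attention = {"funeral": 0.7, "flowers": 0.1, "quiet": 0.2}  # sums to 1

weighted = sum(happiness[w] * attention[w] for w in happiness)
uniform = sum(happiness.values()) / len(happiness)
print(f"attention-weighted: {weighted:.2f}, uniform: {uniform:.2f}")
# A salient negative word ("funeral") dominates the attention-weighted score
# even though two of the three words are neutral-to-positive.
```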
I also have a question about how the authors selected their corpora. For some languages there are diverse sources, covering Twitter, books, and movie subtitles, while for others there are not. Did they take the effect of these differing sources into account, and might the uneven coverage across languages bias the findings?
I am also wondering why the authors treat word frequency as a significant confounding variable in assessing positivity bias, and how it might impact the results.
The analysis in this paper relies on word frequency to capture the distributions of average happiness scores. However, it is not clear how they resolved the issue of negation. When somebody says "not good" instead of "bad", the overall meaning is negative; but if you simply count the frequencies of negative and positive words, "not good" registers as one negator plus one positive word rather than as one negative phrase. Consequently, if they did not account for this, part of the positive skew in wording might be an artifact of negated constructions.
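To make the failure mode concrete, here is what a pure unigram tally does to "not good", next to a minimal bigram rule that flips the polarity of a negated word (the word lists are illustrative only):

```python
# A unigram tally scores "not good" as one positive hit; a minimal bigram
# rule that flips polarity after a negator catches this case.
# The word lists are illustrative only.
POSITIVE, NEGATIVE, NEGATORS = {"good", "happy"}, {"bad", "sad"}, {"not", "never"}

def unigram_score(words):
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

def bigram_score(words):
    score, negate = 0, False
    for w in words:
        if w in NEGATORS:
            negate = True
            continue
        s = (w in POSITIVE) - (w in NEGATIVE)
        score += -s if negate else s
        negate = False
    return score

words = "the food was not good".split()
print(unigram_score(words), bigram_score(words))  # 1 vs -1
```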
As others have noted, I'm skeptical about content versus context here. In the conclusion, the authors say:
"Our major scientific finding is that when experienced in isolation and *weighted properly according to use*, words, which are the atoms of human language, present an emotional spectrum with a universal, self-similar positive bias."
The emphasized part seems to imply proper accounting for context (perhaps using n-grams or the like), but the methods section says:
"We then paid native speakers to rate how they felt in response to individual words on a nine-point scale, with 1 corresponding to most negative or saddest, 5 to neutral, and 9 to most positive or happiest"
which is pure content analysis. Am I missing something? Where did they account for context?
My understanding is that sentiment was measured by recruiting native speakers of each language, with the authors collecting five million human assessments in total. That large number of assessments is presumably meant to counter individual human bias, but I suspect this may limit the applicability of the approach to smaller datasets.
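The statistical intuition is the standard error of the mean: averaging many independent ratings per word shrinks individual noise roughly as 1/sqrt(n). A toy simulation (the rating counts and noise level are made up):

```python
# Averaging many noisy ratings per word stabilizes the estimate (the standard
# error shrinks roughly as 1/sqrt(n)); a small rater pool does not.
# True score and noise level here are made up for illustration.
import random
random.seed(0)

def mean_rating(n_raters, true_score=6.0, noise=2.0):
    ratings = [min(9.0, max(1.0, random.gauss(true_score, noise)))
               for _ in range(n_raters)]
    return sum(ratings) / n_raters

for n in (3, 50, 1000):
    print(f"{n:>4} raters -> estimated score {mean_rating(n):.2f}")
```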
As many have mentioned, the research itself is fascinating, but viewed from various perspectives it raises a lot of questions and skepticism about the result. I am also wondering whether language families affect the result, considering the wide range of languages the research tried to cover.
Take English and Spanish, both of which build meaningful words out of characters that carry no meaning on their own. Analyzing positivity with isolated words may yield accurate results there. But for languages like Chinese, where each character carries a range of meanings and different combinations can completely alter a word's positivity, the method the researchers applied may not be the best way to approach the question.
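A concrete case is 不错 ("not bad", i.e. good): the characters 不 ("not") and 错 ("wrong") both lean negative in isolation, but the two-character word is positive. A sketch using the jieba segmenter, a common choice for Chinese, with illustrative scores:

```python
# Whether Chinese text is scored per character or per segmented word can flip
# the polarity: 不 ("not") and 错 ("wrong") read negative in isolation, but
# the word 不错 means "not bad", i.e. good. Scores are illustrative.
# Requires: pip install jieba
import jieba

scores = {"不": 3.0, "错": 2.5, "不错": 7.0, "电影": 5.5}

text = "这部电影不错"  # "This movie is pretty good"
per_char = [scores.get(c, 5.0) for c in text]
per_word = [scores.get(w, 5.0) for w in jieba.lcut(text)]
print(per_char)  # the characters 不 and 错 drag the score down
print(per_word)  # the segmented word 不错 scores positive
```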
This paper is predicated on a flawed understanding of what language means in the theoretical literature. Language, as understood in modern linguistics (its natural place of study), is the generative mechanism that transforms lexical units (usually thought of as words or units of semantics) into externalized language (what is colloquially called language: that which is spoken, written, or available to a person for introspection). This distinction is important because studying externalized language (in this case text, which the authors do study) does not necessarily reveal anything about the "structure of language", as they claim when they say "Such a data-driven approach is crucial for both understanding the structure of language". That is an absurd claim, akin to saying the causal mechanisms of gravity can be understood by analyzing data on different objects dropped from various heights. The idea that causal mechanics (or generative processes, in this case) can be understood through pure data analysis of the output of those mechanisms is a curious concept in modern, post-Galilean science.
I am particularly interested in the Chinese part because I am a native Chinese speaker. Chinese forums and social media spaces are nowhere near friendly and positive; toxic comments are everywhere, if not already reported and deleted. The Google Books corpus is good, but not sufficient for Chinese. I would even expect a negative result for Chinese, regardless of whether the analysis is done at the word, phrase, or sentence level.
To be honest, I do not quite believe the results of this research. Several puzzles arose for me after reading the paper:
1. The concept of "positivity bias": I would take positivity bias in language to mean that people feel positive words more strongly than negative words of equal intensity. Here, however, it seems to mean a stronger tendency to *use* positive words. If that is how the authors define positivity bias, their results cannot feed into textual sentiment analysis, because they can only show that people use negative words less often; they cannot supply words' sentiment scores.
2. I checked several of the Chinese words listed in the average-happiness graphs, and I do not think they are everyday words for Chinese speakers today. So the Google corpus might not be an ideal source.
3. Sentiment analysis is usually highly sensitive to the linguistic context in which words are used, and that context is absent in this research, so I doubt the external validity of the conclusions.
This measurement of happiness in language, and the shape-of-story application, are pretty interesting! But I have a question about the native speakers' rating of word happiness. Since words in each language are scored by its own native speakers, the ratings may be biased by each culture's different definition of positiveness. Should we introduce a better-justified method to assess the happiness of words? If so, what other methods could we apply to this measurement?
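One partial remedy would be to normalize within each language before comparing, e.g. z-scoring each language's ratings against its own mean and spread, so that comparisons are relative to each language's own baseline. A sketch with invented numbers:

```python
# Z-score each language's ratings against its own mean and spread, so a
# cross-language comparison asks "positive relative to that language's own
# baseline" rather than comparing raw, culturally anchored scale use.
# All numbers are invented for illustration.
from statistics import mean, stdev

ratings = {
    "lang_A": [6.1, 6.4, 5.9, 6.8, 6.2],  # raters who use the scale's top freely
    "lang_B": [5.2, 5.5, 5.0, 5.9, 5.3],  # raters who cluster near the midpoint
}

for lang, xs in ratings.items():
    mu, sd = mean(xs), stdev(xs)
    print(lang, [round((x - mu) / sd, 2) for x in xs])
# The z-scored profiles are identical even though the raw means differ.
```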
As others have noted in greater detail here, the paper's constructs seem highly flawed vis-a-vis its claims. The extrapolation from 'frequency' to 'importance' is lost on me, as is their definition of 'importance'. As Garcia et al. note in a response letter to the journal, the authors' corpora include articles and prepositions, which are in turn assigned a 'positivity' score, which is meaningless.
This paper, for me, stands out as a good example of why we need better language models that build on linguistic theory and work to replicate the generative structure of language.
@arun-131293 puts it best when he says:
The idea that causal mechanics (or generative processes, in this case) can be understood through pure data analysis of the output of those mechanisms is a curious concept in modern, post-Galilean science.
Regarding the single word as a tool for measuring emotion in corpora is, from my perspective, a compromise, but still a reasonable choice for balancing frequency coverage against correctness. It is certainly true that in sentences the specific use of a word can indicate a totally different meaning, but given the great amount of data in the corpora, it is still understandable to use words as a representation of our emotions. I do, however, doubt the soundness of translating words across languages via Google Translate.
My question is: since their data consist of the most frequently used words, but words derive meaning from context, I'm not sure that looking at words alone gives an accurate picture of the sentiment they convey. Moreover, happiness and sadness might mean different things to people from different countries. I wonder whether they accounted for this when constructing the nine-point scale?
Dodds, Peter Sheridan, et al. 2015. "Human Language Reveals a Universal Positivity Bias." Proceedings of the National Academy of Sciences 112(8):2389–2394. doi:10.1073/pnas.1411678112