Computational-Content-Analysis-2020 / Readings-Responses

Repository for organising "exemplary" readings and posting responses.

Discovering Higher-Level Patterns - DeDeo et al 2018 #29

Open jamesallenevans opened 4 years ago

jamesallenevans commented 4 years ago

Barron, Alexander TJ, Jenny Huang, Rebecca L. Spang, and Simon DeDeo. 2018. “Individuals, institutions, and innovation in the debates of the French Revolution.” Proceedings of the National Academy of Sciences 115(18): 4607-4612.

katykoenig commented 4 years ago

This paper uses Kullback-Leibler divergence (KLD) to measure surprise in language patterns. I have seen other applications of this kind of Bayesian surprise, e.g. surprise maps, but I was wondering whether other divergence measures could also capture surprise?
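For concreteness, here is a minimal sketch (not from the paper, using made-up distributions) of KLD alongside Jensen-Shannon divergence, one symmetric, bounded alternative that could also serve as a surprise-style measure:

```python
# Minimal sketch: KL divergence vs. Jensen-Shannon divergence between two
# hypothetical topic mixtures. Not the authors' code.
import numpy as np
from scipy.stats import entropy                    # entropy(p, q) = KL(p || q)
from scipy.spatial.distance import jensenshannon   # sqrt of the JS divergence

p = np.array([0.70, 0.20, 0.10])   # e.g. topic mixture of a new speech
q = np.array([0.40, 0.40, 0.20])   # e.g. topic mixture of an earlier speech

kld = entropy(p, q, base=2)             # asymmetric "surprise" of p relative to q, in bits
jsd = jensenshannon(p, q, base=2) ** 2  # symmetric, bounded between 0 and 1

print(f"KL(p || q) = {kld:.3f} bits, JSD(p, q) = {jsd:.3f} bits")
```

JSD stays finite even when one distribution puts zero mass where the other does not, which is one practical reason it sometimes replaces KLD; whether it captures the directed, past-to-present notion of surprise the paper needs is a separate question.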

laurenjli commented 4 years ago

The authors don't discuss evaluation metrics like accuracy for the results of their study. What metrics can or should be used when evaluating topic models?
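For what it's worth, common intrinsic diagnostics are held-out perplexity and topic coherence. A minimal sketch with gensim, using an invented toy corpus (`texts` is hypothetical):

```python
# Minimal sketch (not from the paper): two intrinsic diagnostics for an LDA
# topic model, perplexity and topic coherence, using gensim on a toy corpus.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

# Hypothetical tokenized "speeches".
texts = [
    ["assembly", "speech", "debate", "vote"],
    ["tax", "finance", "debate", "vote"],
    ["war", "army", "speech", "assembly"],
    ["tax", "finance", "army", "war"],
]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(doc) for doc in texts]

lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=10, random_state=0)

# Perplexity (lower is better); ideally computed on held-out documents.
print("log perplexity:", lda.log_perplexity(corpus))

# Topic coherence (higher is better); u_mass shown here, c_v is another option.
coherence = CoherenceModel(model=lda, corpus=corpus, dictionary=dictionary,
                           coherence="u_mass").get_coherence()
print("u_mass coherence:", coherence)
```

Neither metric is "accuracy" in a supervised sense; human inspection of the top words per topic usually complements them.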

lkcao commented 4 years ago

Although I tried to read every line, I still find myself a little confused about the method of this piece. How do the authors combine topic modelling and KL divergence? What is the input to the KL divergence here? I would really appreciate it if anyone could comment on my post and answer this question :-)

wunicoleshuhui commented 4 years ago

Like the comments above, I am also confused about the exact measurement metrics, specifically how surprise is measured through KLD. Is the sampled speech data sufficient for measuring transience and novelty?

skanthan95 commented 4 years ago

In this work, the authors found that NCA members on the left were more likely to generate new, "innovative" speech patterns, while members on the right tended to preserve familiar word patterns--both groups demonstrating meaningful differences in how they transmitted patterns of language use. Notably, members who were charismatic were more likely to innovate and generate phrases that would catch on. However, I don't fully understand how they accounted for these individual effects:

("The data similarly show individual orators departing from system-wide trends. Of the top 40 speakers in the assembly, 27 show significant deviations from aggregate patterns in either novelty or resonance at at least the p <0:05 level, with 22 speakers showing deviations at p..")

Additionally: could the computational methodologies the authors used be applied to map the linguistic innovations that contributed to the evolution of the Romance languages into distinct tongues?

sunying2018 commented 4 years ago

In the materials and methods section, the article mentions that the model used is Latent Dirichlet Allocation (LDA), which, as we know, is an unsupervised model. But the article does not discuss the evaluation of this model in detail, so I am wondering how we should evaluate its performance?

luxin-tian commented 4 years ago

The appendix of this paper documents the technical process of the analysis in detail. But I wonder how the measurements of novelty, transience, and resonance can themselves be evaluated?

jsmono commented 4 years ago

Besides the confusion about the method that many have pointed out, I am also curious why the authors divided their analysis into two epochs, why this was beneficial to their findings, and how they decided on the time period for the dataset?

rachel-ker commented 4 years ago

> @lkcao: Although I tried to read every line, I still find myself a little confused about the method of this piece. How do the authors combine topic modelling and KL divergence? What is the input to the KL divergence here?

The authors mention in the materials and methods section at the end of the paper that they performed LDA to decompose the speeches into topics and then tracked those topic distributions with KL divergence to measure novelty and transience.
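To make that pipeline concrete, here is a rough sketch under my reading of the paper's definitions (my own simplification, not the authors' code): each speech's LDA topic mixture is treated as a probability distribution, novelty and transience of a speech are its average KLD from the `w` speeches before and after it, and resonance is novelty minus transience.

```python
# Rough sketch (not the authors' code) of novelty, transience, and resonance
# computed from LDA document-topic mixtures, under my reading of the paper:
#   novelty(j)    = mean KL(speech_j || speech_{j-i}), i = 1..w  (looking back)
#   transience(j) = mean KL(speech_j || speech_{j+i}), i = 1..w  (looking ahead)
#   resonance(j)  = novelty(j) - transience(j)
import numpy as np
from scipy.stats import entropy  # entropy(p, q) = KL(p || q)

def novelty_transience_resonance(doc_topics, w=5):
    """doc_topics: (n_speeches, n_topics) array of topic mixtures in
    chronological order. Returns novelty, transience, resonance for the
    speeches that have a full window of w speeches on both sides."""
    n = len(doc_topics)
    novelty, transience, resonance = [], [], []
    for j in range(w, n - w):
        past = [entropy(doc_topics[j], doc_topics[j - i]) for i in range(1, w + 1)]
        future = [entropy(doc_topics[j], doc_topics[j + i]) for i in range(1, w + 1)]
        N, T = np.mean(past), np.mean(future)
        novelty.append(N)
        transience.append(T)
        resonance.append(N - T)
    return np.array(novelty), np.array(transience), np.array(resonance)

# Hypothetical topic mixtures for 20 speeches over 4 topics (stand-in for LDA output).
rng = np.random.default_rng(0)
doc_topics = rng.dirichlet(np.ones(4), size=20)
N, T, R = novelty_transience_resonance(doc_topics, w=3)
print(N[:3], T[:3], R[:3])
```

So the inputs to the KL divergence are pairs of per-speech topic distributions, compared across a time window rather than all at once.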

YanjieZhou commented 4 years ago

I think this paper gives few details about its evaluation metrics. Regarding a quantitative interpretation of novelty and resonance, I also failed to find a clear answer; I am curious about this and hope to learn more.

heathercchen commented 4 years ago

Like @laurenjli mentioned, I am also wondering about the validity and accuracy of the method the authors chose. Why did they choose KLD rather than other possible ways to measure closeness and divergence?

ziwnchen commented 4 years ago

I found that most of the comments above are about the methods and metrics being used. Apart from that, I am also curious about the general idea of this paper. It turns out that the central question is "How do democracies make decisions?". Does this paper give a valid answer to that question? Are decisions the same thing as word patterns? Could the patterns found in the French Revolution be generalized to modern democracies?

alakira commented 4 years ago

Maybe this is not the authors' intention, but couldn't we combine their results with the actual legislation passed by Congress to see how influential each speech was?

HaoxuanXu commented 4 years ago

It's interesting to evaluate the creation and management of innovation in the setting of the French Revolution. It would be interesting to see if there are other metrics that could complement this NLP approach.

acmelamed commented 4 years ago

There is a fundamental contradiction in the premise of this study: its framing of the NCA as a hermetic system that exercised power purely through linguistic rhetoric. The authors say in their introduction that they are interested in how the ideas expressed in the NCA entered that system, but this question is not satisfactorily explored. That leaves the results open to the critique that they place insufficient emphasis on the decentralized production of power enacted from outside upon the members of parliament.

kdaej commented 4 years ago

This study takes both qualitative and quantitative approaches to studying the patterns of debate in the French Revolution. With developments in computation, it seems to me that computational content analysis enables us to explore questions we previously could not, with higher external validity and objectivity. However, a qualitative approach to the data and human interpretation of the results still seem inevitable in this study. If computational analysis still requires human involvement, can it achieve objectivity, or is it simply meaningless to ask this question?

cytwill commented 4 years ago

I am quite interested in the authors' definitions of novelty, transience, and resonance. Though these measurements are drawn from information theory, I wonder whether they are commonly accepted in social science? When we do our own projects, is it a good choice to invent our own measurements for something conceptual, or should we stick to metrics that are already widely used?

Lizfeng commented 4 years ago

This paper is set against the background of the French Revolution. It asks the research question: how did the revolution's first parliament, the National Constituent Assembly (NCA), create an innovative model for future democracies? The authors analyze the speech patterns of assembly members to trace this innovation process, characterizing new ideas in speeches by novelty and transience. The paper uses topic modeling and KLD to measure these two quantities. The most important finding in this research is an innovation bias: novel speeches were unexpectedly influential.