Computational-Content-Analysis-2020 / Readings-Responses

Repository for organising "exemplary" readings, and posting responses.

Discovering Higher-Level Patterns - Grimmer 2013 #27

Open jamesallenevans opened 4 years ago

jamesallenevans commented 4 years ago

Grimmer, Justin. 2013. “Appropriators not Position Takers: The Distorting Effects of Electoral Incentives on Congressional Representation.” American Journal of Political Science 57(3): 624-642.

tzkli commented 4 years ago

This paper exemplifies how unsupervised methods can discover patterns not easily discernible by human readers. I'm having some trouble understanding (intuitively) how the multidimensional scaling (MDS) algorithm works. How should we go about choosing the number of dimensions? Is the choice determined primarily by theoretical priors? Or are there any computational/statistical caveats? The author does not elaborate on this in the paper.
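
For intuition on the dimensionality question, here is a minimal sketch of classical (Torgerson) MDS written with NumPy. It is not the procedure from the paper: the toy data, the function names `classical_mds`, `pairwise_distances` and `stress`, and the use of normalized stress as the diagnostic are all assumptions made for illustration. The idea is to embed the same distance matrix in 1, 2 and 3 dimensions and watch how much of the original distance structure each embedding loses.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows in X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(D, k):
    """Embed n items in k dimensions from an n x n distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)

def stress(D, coords):
    """Normalized mismatch between original and embedded distances."""
    D_hat = pairwise_distances(coords)
    return float(np.sqrt(((D - D_hat) ** 2).sum() / (D ** 2).sum()))

# Toy data (assumption): 20 items in 5-D that actually lie on a 2-D plane.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 5))
D = pairwise_distances(X)

stress_by_k = {k: stress(D, classical_mds(D, k)) for k in (1, 2, 3)}
for k, s in stress_by_k.items():
    print(f"k={k}: stress={s:.4f}")
```

In this toy case the stress falls to roughly zero at k=2 and a third dimension adds nothing, because the data were built to be two-dimensional. On real text data the drop is rarely this clean, so an "elbow" in such a curve is usually weighed together with interpretability and theoretical priors rather than used alone.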

bjcliang-uchi commented 4 years ago

I enjoyed this paper so much, and my final project is closely related to this method! But I still struggle to understand how the author assigns each press release to a category. The word stems are so minimal, and thus vague, for topic detection. Also, how are these dimensions selected? In particular, what makes a regional issue regional?

deblnia commented 4 years ago

I think this paper fits best with the idea of grounded theory when we take the case study about the Iraq War as the first thing that Grimmer encountered in formulating the idea of this paper. I also think that the use of press releases is well-suited to the claim he's trying to make -- it captures a nice distinction that isn't always available in textual analysis (what is written down isn't the entirety of what is thought).

In re: @tzkli's comment -- does he actually use an unsupervised method? I might be misunderstanding, but he explicitly says he and some research assistants hand-coded some press releases according to the typology Mayhew laid out in 1974, and then used supervised learning to extend those labels to the entire corpus. Does the higher-level pattern in this case come from supervised learning or unsupervised learning? Does it matter?
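
To make the "hand-code a sample, then extend the labels" workflow concrete, here is a minimal sketch using a multinomial Naive Bayes classifier in plain Python. This is a swapped-in technique for illustration, not the classifier used in the paper. The toy press releases, the two label names, and the helpers `train_nb` and `predict_nb` are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Count documents per class and words per class from hand-coded examples."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, text, alpha=1.0):
    """Return the most probable class, using add-alpha (Laplace) smoothing."""
    class_counts, word_counts, vocab = model
    tokens = [t for t in text.lower().split() if t in vocab]  # skip unseen words
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        total = sum(word_counts[label].values())
        score = math.log(n / n_docs)  # log prior
        for t in tokens:
            score += math.log((word_counts[label][t] + alpha)
                              / (total + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical hand-coded sample (toy data, not from the paper).
hand_coded = [
    ("senator announces grant funding for local highway", "credit_claiming"),
    ("senator secures federal funding for county hospital", "credit_claiming"),
    ("senator opposes the war and criticizes the president", "position_taking"),
    ("senator supports immigration reform bill", "position_taking"),
]
docs, labels = zip(*hand_coded)
model = train_nb(docs, labels)

# Extend the labels to releases nobody hand-coded.
unlabeled = [
    "senator announces funding for new bridge",
    "senator criticizes the war policy",
]
predictions = [predict_nb(model, text) for text in unlabeled]
print(predictions)
```

The point of the sketch is that the categories themselves come from the human coding scheme, and the classifier only scales them up. Any higher-level pattern found afterwards is then a pattern in the predicted labels, which is one way to frame the supervised-versus-unsupervised question.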

yirouf commented 4 years ago

I have similar thoughts about this paper. I think it is closely related to grounded theory, and I agree that methods like this can plausibly discover meaningful clusters that humans might miss because of their assumptions and prior knowledge. However, I think the extent to which this method can find such clusters depends on the learning step used later to generalize the labels to the larger corpus. How might the choice between supervised and unsupervised learning influence the outcome?

arun-131293 commented 4 years ago

The main conclusion of this paper is that "Across policy debates, the most conservative Republicans and most liberal Democrats articulate positions much more often than their more moderate colleagues", and the suggested explanation relates to "the effect of the electoral connection:...representatives have greater electoral incentive to participate in vitriolic exchanges (as) members of Congress now represent districts with a larger concentration of copartisans than 30 years ago". I wonder how this fits with the work of Gilens and Page (2014), which concludes that the average American voter does not in fact have a say in legislation, while organized interests (chambers of commerce, even labor unions) do. If polarization has an electoral connection but the legislation actually passed does not, how does one account for the discrepancy?

acmelamed commented 4 years ago

The premise of establishing a quantifiable measure of "extremity" serves as the basis for the majority of the claims and data interpretations made by this study, but it remains unclear to me how exactly this standard was derived prior to the other tests, like representatives' positions on the Iraq War. In fact, it seems at times as if that variable is dependent on the data from such tests.

yaoxishi commented 4 years ago

While I got the main idea of the paper, I am still confused by the methods the author used to classify the data, for example, how the algorithms sort the patterns in the press releases into different dimensions. Also, the paper doesn't seem to mention how the validity of the algorithms was evaluated.

sanittawan commented 4 years ago

Data collection-wise, I am wondering why he chose to analyze press releases from 2005 to 2007 (3 years) since not all senators are up for election at the same time. How did he control for the differences between senators who were up for election during those years and those that were not?

chun-hu commented 4 years ago

I would appreciate more detail on how the author classifies the patterns in the press releases, since we are learning a lot of classification methods in the class. In addition, did the author compare different methods to make sure that the approach is valid?