Computational-Content-Analysis-2020 / Readings-Responses

Repository for organising "exemplary" readings, and posting responses.

Counting Words & Phrases - Gentzkow & Shapiro 2007 #14

Open jamesallenevans opened 4 years ago

jamesallenevans commented 4 years ago

Post questions here for:

Gentzkow, Matthew & Jesse M. Shapiro. 2007. “What Drives Media Slant? Evidence from U.S. Daily Newspapers.” Econometrica 78(1): 35–71.

heathercchen commented 4 years ago

As an economics major, I must admit this article is entirely novel in that it applies content analysis to obtain its main independent variable of interest (the slant index), going far beyond the scope of traditional economic methodology. But I am worried about one important assumption in this article. For model simplicity and because of data limitations, the authors assume that residents' ideologies within the same zip code are the same, which is not the case in reality. It is obvious that we do not choose where to live by our neighbors' partisanship. Therefore my question is: is there any possibility that the article could relax this strict assumption?

sanittawan commented 4 years ago

I am impressed by the research methodology in this paper! My question pertains to the statistic used for selecting the top 1,000 phrases in section 3.1 of the paper. The authors choose Pearson's χ^2 for the task, but I wonder what the alternatives are.
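For readers who want to see what ranking phrases by Pearson's χ^2 involves, here is a minimal sketch. It uses the textbook 2x2 statistic (phrase vs. all other phrases, Democrats vs. Republicans), which may differ by a constant factor from the exact formula in the paper. The function names and the toy counts are hypothetical, made up for illustration only.

```python
def chi2_2x2(dem_count, rep_count, dem_total, rep_total):
    """Textbook Pearson chi-squared for a 2x2 table (an assumption, not
    necessarily the paper's exact formula).

    Rows: this phrase vs. all other phrases. Columns: Democrats vs. Republicans.
    """
    a = dem_count                # phrase used by Democrats
    b = rep_count                # phrase used by Republicans
    c = dem_total - dem_count    # other phrases used by Democrats
    d = rep_total - rep_count    # other phrases used by Republicans
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0               # degenerate table: no information
    return n * (a * d - b * c) ** 2 / denom


def top_partisan_phrases(dem_counts, rep_counts, k):
    """Return the k phrases with the largest chi-squared values."""
    dem_total = sum(dem_counts.values())
    rep_total = sum(rep_counts.values())
    phrases = set(dem_counts) | set(rep_counts)
    scored = {
        p: chi2_2x2(dem_counts.get(p, 0), rep_counts.get(p, 0),
                    dem_total, rep_total)
        for p in phrases
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Toy counts, invented for illustration (not data from the paper).
dem = {"estate tax": 30, "death tax": 2, "health care": 50}
rep = {"estate tax": 3, "death tax": 40, "health care": 45}
print(top_partisan_phrases(dem, rep, 2))
```

Note that the statistic is symmetric: it flags phrases that are strongly associated with either party, so the direction of the association has to be read off the counts separately.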

@heathercchen asks a great question. You might have already seen this, but on page 50 the authors point to their 2007 paper and note that their "main findings survive in a model that allows explicitly for within-zip code heterogeneity in political ideology."

luxin-tian commented 4 years ago

This paper integrates computational analysis methods with the canonical economic modeling as well as empirical analysis in such a novel and precise way that the conclusion robustly reveals important implications for media industrial regulations.

iamlaurenbeard commented 4 years ago

I am interested in the fact that the researchers selected the year of 2001 to address their questions of media slant -- and how the findings may change if the authors were to pull newspapers from vastly different periods in time (e.g. a presidential election year). The authors "show that consumer demand responds strongly to the fit between a newspaper’s slant and the ideology of potential readers, implying an economic incentive for newspapers to tailor their slant to the ideological predispositions of consumers" and "find much less evidence for a role of newspaper owners in determining slant." However, I wonder how authors such as these could connect these findings to the socio-political context from which their data is derived.

ckoerner648 commented 4 years ago

Gentzkow and Shapiro 2007 claim that newspapers can increase their profits if they shift their content toward the predominant political attitude in a given zip-code area. I’m adding to @heathercchen’s point that it is not very realistic to assume that customers in a given region all share the same political world view. If the authors instead assumed a diverse population, and given that newspapers seldom face competitors in their distribution area, newspapers would probably try to serve the largest possible fraction of the population, including both left- and right-leaning readers. If a newspaper shifted, e.g., to the right, it could indeed gain readers on the right, but those gains might not outnumber the loss of readers in the left-leaning population. Therefore, it could plausibly be the dominant (profit-maximizing) strategy for a newspaper to pursue a somewhat balanced reporting approach (with a given tendency reflecting the proportion of Republicans and Democrats in a region) to gain readers from both sides.

luisesanmartin commented 4 years ago

Building on the points mentioned by @sanittawan and @luxin-tian, I wonder which alternative types of indicators could be used to measure if a given phrase is used more frequently by the Democrats or the Republicans in Congress. I understand that one of the criteria of the authors for choosing this statistic was its computational simplicity. Perhaps with the advances in computational power in recent years, they could have chosen another measure?
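As one illustration of an alternative indicator, here is a minimal sketch of a smoothed log-odds ratio. This is not the measure used in the paper; the function name, the smoothing constant `alpha`, and the example counts are all assumptions chosen for illustration.

```python
import math


def smoothed_log_odds(dem_count, rep_count, dem_total, rep_total, alpha=0.5):
    """Log-odds ratio of a phrase between the two parties.

    Positive values mean the phrase is relatively more common among
    Democrats, negative values among Republicans. `alpha` is an additive
    smoothing constant (a hypothetical default) that avoids division by zero.
    """
    dem_odds = (dem_count + alpha) / (dem_total - dem_count + alpha)
    rep_odds = (rep_count + alpha) / (rep_total - rep_count + alpha)
    return math.log(dem_odds / rep_odds)


# Invented counts: a phrase used 30 times out of 82 by one party
# and 3 times out of 88 by the other.
print(smoothed_log_odds(30, 3, 82, 88))
```

Unlike χ^2, this score is signed, so a single number gives both the strength and the direction of the partisan association.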

Related to that, I also noticed that this paper was published before the spread of Statistical Learning techniques and their use in Economics. That made me wonder whether we should expect these results to change if, instead of using the method described in the paper, the authors had chosen a Learning algorithm to predict how close the language of a newspaper is to the words chosen by congresspeople as a measure of slant.

sunying2018 commented 4 years ago

This article demonstrates an innovative method to measure the slant of a newspaper. As mentioned by @sanittawan, for the "feature selection" step, this article chooses the phrases with the greatest values of this statistic for each phrase length. I am curious about the reason this statistic was used here. In addition, I am wondering whether it is possible to apply other feature selection techniques, such as PCA, in the phrase selection process.

vahuja92 commented 4 years ago

I found it interesting that the authors used raw term frequency to identify important Republican and Democrat terms, instead of a score like the tf-idf score, which would identify the importance of each term. Could using different methods to score the importance of terms used by Republicans and Democrats impact the words associated with the parties?
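To make the tf-idf comparison concrete, here is a minimal sketch of one common variant (relative term frequency times the log of inverse document frequency). Treating each party's speech as one "document" is an assumption made for illustration, as are the function name and the toy tokens; libraries often use slightly different smoothing.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf-idf scores for a list of tokenized documents.

    tf  = count of term in doc / length of doc
    idf = log(number of docs / number of docs containing the term)
    Returns one {term: score} dict per document.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    scores = []
    for doc in docs:
        term_counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores


# Invented tokens: one "document" per party.
docs = [["tax", "relief", "tax"], ["tax", "cut"]]
scores = tf_idf(docs)
print(scores[0])
```

In this variant a term that appears in every document gets a score of zero no matter how often it is used, which is the main way tf-idf differs from raw term frequency.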

arun-131293 commented 4 years ago

It is very interesting to read how content can be analyzed to gain explanations for socio-economic phenomena. As Table IV on p. 61 shows, the actual slant of the more than 400 newspapers analyzed in the study almost equals the profit-maximizing slant predicted by the model. This result draws the very basic alleged function of newspapers into question: they could be interpreted not as entities trying to produce reports reflecting real events in the world, but as institutions that are incentivized to reflect the world view of their readers. A qualitative approach to a similar topic is the book "Manufacturing Consent", a canonical work on media bias, which comes to similar conclusions.

rkcatipon commented 4 years ago

I would love to see the authors repeat this work for today's political polarization and media fracture, especially now that media outlets no longer have to give equal coverage time to candidates. What I find interesting in this work is the theoretical construct that political bias transfers in one direction, from politicians to news. Now we're seeing the opposite, where the president is picking up phrases from Fox News.

Because the frequency method seems agnostic to slant directionality, could this method be applied to pick up phrases that travel from news outlets to politicians, as opposed to the other way around?

jsmono commented 4 years ago

The method of this article is amazing, and it seems very useful for studying the bias of platforms or accounts. It seems like they used two datasets for the research: one is based on the newspapers they collected, and the other is used to test the newspaper dataset. I'm wondering whether, when studying a different platform such as Reddit, it is necessary to use another testing dataset so that the language and the phrases are up to date.