Computational-Content-Analysis-2020 / Readings-Responses-Spring

Repository for organizing orienting, exemplary, and fundamental readings, and posting responses.

Measuring Meaning & Counting Words - (E1) Gentzkow & Shapiro 2007 #2

Open jamesallenevans opened 4 years ago

jamesallenevans commented 4 years ago

Post questions here for the following exemplary reading:

  1. Gentzkow, Matthew & Jesse M. Shapiro. 2007. “What Drives Media Slant? Evidence from U.S. Daily Newspapers.” Econometrica 78(1): 35–71.
nwrim commented 4 years ago

For Gentzkow & Shapiro (2007)

This was a highly interesting paper! I really liked the fact that something as simple as counting phrases, ruling out words at (as the authors put it) an arbitrary cutoff, and applying $\chi^2$ statistics produced a robust index of ideology. Coming from a non-polisci background, I found this one of the more interpretable ways of calculating ideology. Two questions I had:

  1. Will this kind of methodology work even when the underlying phenomenon is more complex than the bipartisan state of American politics? I think that as the number of categories increases, fewer phrases will be strongly associated with any single category. I guess this would not be a problem with a large dataset, but I am curious whether there is a way to circumvent this kind of problem nicely.

  2. This is the first time I have seen $\chi^2$ statistics used to rank features by their association with a category (I usually see them in null-hypothesis tests assuming all categories follow the same distribution). How does this compare to other metrics like TF-IDF? Are there specific strengths of $\chi^2$ statistics that make them more attractive in some settings? (Also, since this is a relatively old paper, I would love to hear about newer methods for assessing how strongly a word is bound to a specific category.)
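To make my second question concrete, here is a minimal sketch (my own toy code with made-up counts, not the authors' implementation) of the kind of $\chi^2$ statistic the paper uses to rank how well a phrase separates Republican from Democratic speech:

```python
# Toy sketch of the phrase-selection statistic as I read it: a Pearson-style
# chi-square on a 2x2 table of (phrase, all other phrases) x (Republican, Democrat).
# All counts below are hypothetical.

def phrase_chi2(f_pr, f_pd, f_npr, f_npd):
    """f_pr/f_pd: counts of the phrase in Republican/Democratic speech;
    f_npr/f_npd: counts of all *other* phrases in Republican/Democratic speech."""
    num = (f_pr * f_npd - f_pd * f_npr) ** 2
    den = (f_pr + f_pd) * (f_pr + f_npr) * (f_pd + f_npd) * (f_npr + f_npd)
    return num / den

# A phrase used heavily by one party scores orders of magnitude higher than a neutral one.
print(phrase_chi2(f_pr=900, f_pd=40, f_npr=4_000_000, f_npd=4_000_000))   # partisan phrase
print(phrase_chi2(f_pr=500, f_pd=480, f_npr=4_000_000, f_npd=4_000_000))  # neutral phrase
```

Part of what I am asking is that, unlike TF-IDF (which weights a term only by its rarity across documents), this statistic uses the counts from both parties jointly, so it directly measures separation between the two categories rather than distinctiveness per se.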

tianyueniu commented 4 years ago

I found the paper extremely interesting! I second Nak's comment in that I think it is really inspiring to see how 'simple' counts can be integrated into economic models to extract powerful insights.

While the authors comprehensively evaluated the effects of consumers' ideology (spatial distribution) on media slant, they also acknowledged that their underlying assumption is that most of the variation in consumer ideology is not affected by newspapers. However, I believe that in reality the relationship between newspaper language and consumers' ideology might be bi-directional, especially if we take time into consideration. Given that, what computational methods should we use to account for bi-directional influences? Is time-series analysis a suitable option?
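For instance, one simple approach I can imagine (just a toy sketch on simulated data, not something from the paper) is a cross-lagged panel regression, where this year's slant is predicted from last year's consumer ideology and vice versa:

```python
# Toy cross-lagged sketch on simulated newspaper-year data: if slant_t responds to
# ideology_{t-1} AND ideology_t responds to slant_{t-1}, influence is bi-directional.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for paper in range(200):
    ideology, slant = rng.normal(), rng.normal()
    for year in range(10):
        rows.append((paper, year, ideology, slant))
        # simulated dynamics: each variable partly follows the other's lagged value
        ideology, slant = (0.8 * ideology + 0.1 * slant + rng.normal(scale=0.1),
                           0.6 * slant + 0.3 * ideology + rng.normal(scale=0.1))

df = pd.DataFrame(rows, columns=["paper", "year", "ideology", "slant"])
df[["ideology_lag", "slant_lag"]] = df.groupby("paper")[["ideology", "slant"]].shift(1)
df = df.dropna()

# The coefficient on the *other* variable's lag in each regression indicates
# the direction(s) of influence.
print(smf.ols("slant ~ slant_lag + ideology_lag", data=df).fit().params)
print(smf.ols("ideology ~ ideology_lag + slant_lag", data=df).fit().params)
```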

iarakshana commented 4 years ago

Agree with the above comments, very interesting paper, but I am also interested in, and sceptical of, the scalability beyond measuring slant in just newspapers, because the method relies on comparing word/phrase frequencies against congressional records. I feel like if we tried to do something similar with online news websites or smaller locally circulated newspapers, it might not track in the same way?

DSharm commented 4 years ago

While reading this paper, I couldn't help thinking about the "false positives" (or negatives) we might get by counting words. For example, there might be a number of "liberal-leaning" papers that use phrases like "death tax" while discussing or critiquing a Republican politician's speech, or to critique the phrase itself. In this application, perhaps such errors didn't matter much, since newspapers were ranked relative to each other to create an index. In general, however, how should we guard against such misclassification when phrase counts are used on their own?
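One crude idea I had (purely my own toy heuristic, not something from the paper) is to flag occurrences of a partisan phrase that look quoted or attributed before counting them:

```python
# Toy heuristic: before counting a partisan phrase, flag occurrences that appear
# inside quotation marks or near attribution verbs, since those may be quoted or
# critical uses rather than the newspaper's own framing. Purely illustrative.
import re

ATTRIBUTION = re.compile(r"\b(said|called|dubbed|labeled|so-called|quote)\b", re.I)

def count_unattributed(text, phrase, window=60):
    hits, flagged = 0, 0
    for m in re.finditer(re.escape(phrase), text, re.I):
        context = text[max(0, m.start() - window): m.end() + window]
        quoted = context.count('"') >= 2  # crude check for surrounding quotes
        if quoted or ATTRIBUTION.search(context):
            flagged += 1
        else:
            hits += 1
    return hits, flagged

article = 'The senator again railed against what he called the "death tax" on estates.'
print(count_unattributed(article, "death tax"))  # -> (0, 1): flagged as attributed
```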

mattconklin commented 4 years ago

The authors develop a model of newspaper demand to explain the determinants of media slant among U.S. newspapers. The results show that media slant conforms to basic economic logic. Readers’ news preferences closely conform with their political preferences. Accordingly, firms respond to this demand by delivering ideologically slanted news content.

My question concerns an opportunity to apply the article’s analysis to specific political issues. In particular, to what extent does media slant in coverage of foreign-policy-related issues reflect reader preferences compared to local economic conditions? This idea is motivated by two assumptions. First, certain phrases of foreign policy rhetoric are commonly employed by both Democrats and Republicans. For example, phrases associated with the post-Cold War consensus of the “liberal international order” are conveniently used by Democrats and Republicans alike when it suits their agenda. The point is that some phrases, especially on topics with relative consensus (e.g., foreign policy), transcend ideological boundaries. Second, building on the economics and political science literatures, voter preferences on foreign policy are highly correlated with local economic conditions. Voters in import-competing locales are historically more supportive of protectionist policies, and of a generally less activist U.S. global role, than those living elsewhere. Based on this rationale, is the article’s main finding likely to change if voters’ foreign policy preferences are economically determined?

harryx113 commented 4 years ago

While the authors used the congressional record as an important piece of the data, they also pointed out the infeasibility of vectorizing the entire congressional text due to its size. Instead, they chose to work with a set of pre-determined phrases that are ideologically indicative. How would the subjectivity introduced by these phrases affect the results? Is the congressional record still necessary if the slant metric is built from pre-determined phrases?

Lesopil commented 4 years ago

This is a really interesting article, especially since I am currently working on Soviet newspapers in the transition from communism to capitalism. It would be very interesting to see how this study holds up in a supposedly non-consumerist society. However, my question is about the effect of the owner's political slant on the slant of the newspaper. In the last section the authors address the possibility of editors influencing the political slant of a newspaper, but I did not find this counterargument to be compelling, primarily because I think it misses the point. From what I know of the newspaper industry, it is the editors who have the most control over what is published and how. That the owners are absent from this picture does not surprise me. I think that the lack of an investigation into the slant of the editorial staff is a significant flaw in this project, but I am also not sure that such data is even available or possible to obtain. My question, then, is about research ethics, so to speak. At what point does "big data" stop being anonymous? At what point does the data become small enough that it can be linked to individuals in ways that those individuals may not want? Is this a significant concern?

minminfly68 commented 4 years ago

It is a super interesting paper that develops a new measure of newspapers' slant. The authors admit that their measure of slant (the core dependent variable) is somewhat broadly aggregated, so I am skeptical about the effect of this aggregation on the slant measure, particularly with respect to changes in political slant over time. During or after a major political movement, a newspaper might switch its prior slant, "betray" it, or waver between the two parties. In this scenario, how can we be sure the slant is measured correctly? I believe that showing with panel data that there is no significant change in a newspaper's political slant over time is necessary for this particular study.
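As a rough illustration of what I mean (my own toy sketch on simulated data, not the authors' analysis), one could compute slant per newspaper per year and check how stable the within-newspaper ranking is from year to year:

```python
# Toy stability check on simulated newspaper-by-year slant: if year-to-year rank
# correlations stay high, a single aggregated slant measure is probably safe;
# if they drop after some event year, aggregation may hide real changes.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
papers = [f"paper_{i}" for i in range(50)]
years = list(range(2000, 2006))
base = rng.normal(size=len(papers))          # each paper's "true" slant
records = [
    (p, y, base[i] + rng.normal(scale=0.2))  # yearly measurement noise
    for i, p in enumerate(papers) for y in years
]
df = pd.DataFrame(records, columns=["paper", "year", "slant"])

wide = df.pivot(index="paper", columns="year", values="slant")
# Spearman correlation of slant between consecutive years
for y in years[:-1]:
    print(y, "->", y + 1, round(wide[y].corr(wide[y + 1], method="spearman"), 2))
```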

linghui-wu commented 4 years ago

This is a really thought-provoking paper; I learned a lot both as an economics student and as a content analysis student.

From the standpoint of natural language processing, I appreciate the innovatively constructed media slant indicator, which is quite understandable and makes sense both intuitively and theoretically. However, I noticed that the researchers employed a relatively short period for analysis:

For each database (the NewsLibrary database and the ProQuest Newsstand database), we use an automated script to calculate the number of articles containing each phrase in each newspaper during the calendar year 2005.

They also point out that the total content sample covers 433 newspapers. So I would like to know: 1) if a news outlet's ideology changes over time, how would those dynamics affect the results? 2) Can this be considered "a large-scale empirical" analysis, as the authors claim? Or how does the definition of "big data" differ between numerical and textual studies?

liu431 commented 4 years ago

This is a very interesting article. However, reading newspapers seems out of fashion these days, as people get their information from social media. Also, social media can track people's reactions after reading, such as liking and commenting. I am wondering how the model would need to change to accommodate such consumer feedback behavior?

bazirou commented 4 years ago

Nice work. In the paper, the authors mention that firms respond strongly to consumer preferences, which "account for roughly 20 percent of the variation in measured slant in our sample," and I wonder how variation is defined here.
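My naive reading (just a guess on my part, not necessarily the authors' exact decomposition) is that "share of variation" means something like the $R^2$ from regressing measured slant on consumer ideology, e.g.:

```python
# Toy illustration of "share of variation explained": R^2 from regressing measured
# slant on consumer ideology, with simulated numbers. This is my guess at the
# interpretation, not the authors' exact variance decomposition.
import numpy as np

rng = np.random.default_rng(2)
ideology = rng.normal(size=500)                            # consumer ideology by market
slant = 0.5 * ideology + rng.normal(scale=1.0, size=500)   # slant partly driven by it

# OLS fit and R^2
slope, intercept = np.polyfit(ideology, slant, 1)
resid = slant - (slope * ideology + intercept)
r_squared = 1 - resid.var() / slant.var()
print(round(r_squared, 2))  # roughly 0.2 with these simulated parameters
```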

shiyipeng70 commented 4 years ago

It is an illuminating paper, especially in those seemingly trivial but significant steps for clearing out noise. However, I am wondering how the authors can directly target the fit with customers’ political attitudes as a determinant of a newspaper’s slant. Apart from customers’ political attitudes and the owners of media outlets, could there be other factors that influence media slant?

Yilun0221 commented 4 years ago

I think it is very interesting to explore topics about the media and politicians. I am very interested in how the researchers chose models for their research goal. For example, they "also estimate a model of the supply of slant, in which we allow slant to respond both to the ideology of a newspaper’s customers and also to the identity of its owner". I want to know more about their model selection techniques.