
6. Prediction & Causal Inference - [E3] 3. Saha, Koustuv, Eshwar Chandrasekharan, Munmun De Choudhury. 2019. #24

JunsolKim opened this issue 2 years ago

JunsolKim commented 2 years ago

Post questions here for this week's exemplary readings: 3. Saha, Koustuv, Eshwar Chandrasekharan, Munmun De Choudhury. 2019. “Prevalence and Psychological Effects of Hateful Speech in Online College Communities” WebSci: Proceedings of the 10th ACM Conference on Web Science, pp. 255–264

konratp commented 2 years ago

I think this paper is built on an interesting premise, but I wonder why the authors chose Reddit as a platform of interest; in my opinion, they don't really address their case selection. In my experience, many of the comment sections on my alma mater's subreddit are filled with alumni reminiscing about their time in college rather than students currently enrolled at the school. Facebook groups for specific classes (e.g., UChicago Class of 2022) seem like a much more widely used outlet for currently enrolled students. Am I correct in the assumption that scientists conducting content analyses are sometimes a bit sloppy in their case selection? It shouldn't surprise anyone that on platforms where users are largely anonymous (such as Reddit or 4chan, as opposed to Facebook), users tend to be more candid about their hate speech. But is this really the kind of speech currently enrolled college students are experiencing in their daily lives?

ValAlvernUChic commented 2 years ago

The paper uses propensity score matching (PSM) to control for observed covariates. If I remember correctly, though, PSM does not account for latent covariates that could act as confounders. I was wondering whether there are other techniques that could account for these latent elements (though it seems, intuitively, incredibly difficult to make such assumptions about unobserved covariates).
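
To make the concern concrete, here is a minimal, hypothetical sketch of propensity score matching on observed covariates (the covariate names are invented, and this is not the authors' actual pipeline). Anything the propensity model never observes stays unbalanced after matching, which is exactly the limitation raised above; sensitivity analyses such as Rosenbaum bounds are one standard way to probe how fragile an estimate is to unobserved confounding.

```python
# Minimal propensity score matching sketch (not the authors' pipeline).
# The covariate names passed in (e.g., posting frequency, account age) are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def psm_match(df, treatment_col, covariates):
    """Match each treated unit to its nearest control on the propensity score."""
    # 1. Estimate propensity scores from *observed* covariates only.
    model = LogisticRegression(max_iter=1000)
    model.fit(df[covariates], df[treatment_col])
    df = df.assign(pscore=model.predict_proba(df[covariates])[:, 1])

    treated = df[df[treatment_col] == 1]
    control = df[df[treatment_col] == 0]

    # 2. One-nearest-neighbor matching on the estimated score.
    nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
    _, idx = nn.kneighbors(treated[["pscore"]])
    matched_control = control.iloc[idx.ravel()]

    # Any confounder missing from `covariates` is never balanced by this step.
    return treated, matched_control
```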

Qiuyu-Li commented 2 years ago

This is a very interesting paper that addresses an important issue. My question is about the paper's identification strategy: is it possible that the frequency of using Reddit is a confounder? For one thing, frequent users are more likely to be exposed to hate speech; for another, they are more accustomed to expressing their private feelings and views on the internet.

NaiyuJ commented 2 years ago

I find that the "Other" hate category accounts for the largest proportion of hate posts. My question is: why not define a few more categories to capture the topics currently lumped into "Other"? More importantly, when we pre-define categories, how can we choose the number of categories that captures the most information from the posts?

GabeNicholson commented 2 years ago

> I think this paper is built on an interesting premise, but I wonder why the authors chose Reddit as a platform of interest; in my opinion, they don't really address their case selection. In my experience, many of the comment sections on my alma mater's subreddit are filled with alumni reminiscing about their time in college rather than students currently enrolled at the school. Facebook groups for specific classes (e.g., UChicago Class of 2022) seem like a much more widely used outlet for currently enrolled students. Am I correct in the assumption that scientists conducting content analyses are sometimes a bit sloppy in their case selection? It shouldn't surprise anyone that on platforms where users are largely anonymous (such as Reddit or 4chan, as opposed to Facebook), users tend to be more candid about their hate speech. But is this really the kind of speech currently enrolled college students are experiencing in their daily lives?

I agree with this, and in my opinion there is also something qualitatively different about hate speech encountered in real life compared to hate speech online. Whenever I read disparaging comments online, I just move on as if nothing happened, and frankly I can hardly remember them in retrospect. When people are hateful in person, however, I recall those instances much faster, and they are much more emotionally salient. All this is to say that I don't think the question asked matches the psychological effects, and that online hate speech is quite different (which is not to say it doesn't matter).

Jasmine97Huang commented 2 years ago

This is an interesting paper and definitely highlights the challenges of using text in causal inference. In general, their use of text (Reddit comments) is limited to classification (of types of hate speech and stress level) to extract high-level information of interest. The lexicon-based approach is useful, but I am wondering how neural network-based methods could be applied in their research setting. For example, could causal BERT be adopted to better capture the types of hate speech, their intensity, and their offensiveness?
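
As a concrete, hedged illustration of the kind of neural substitution being asked about: a pretrained transformer classifier could replace the lexicon step and return graded scores rather than binary matches. The model identifier below is just one publicly available toxicity model, not the authors' method and not the full causal-BERT framework, only the classification piece.

```python
# Hedged sketch: a transformer classifier in place of a lexicon lookup.
# "unitary/toxic-bert" is one public toxicity model; any fine-tuned
# hate-speech classifier could be swapped in. Not the authors' method.
from transformers import pipeline

clf = pipeline("text-classification", model="unitary/toxic-bert")

comments = [
    "you people don't belong on this campus",
    "good luck with finals everyone!",
]

for comment, pred in zip(comments, clf(comments)):
    # Each prediction carries a label and a confidence score, which could serve
    # as a graded intensity measure instead of a binary lexicon hit.
    print(comment, "->", pred["label"], round(pred["score"], 3))
```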

facundosuenzo commented 2 years ago

Very interesting paper! I wondered how we could combine data from multiple sources (e.g., Reddit plus Tumblr) to create a more robust classifier. What would the consequences be for inferring causality in this scenario (for instance, using data from both Reddit and Tumblr to draw causal inferences about other social networks)?

LuZhang0128 commented 2 years ago

When the authors compare the college subreddits to "elsewhere on Reddit," they choose very general subreddits like r/AskReddit, r/aww, and r/movies. I don't think the comparison is fair, since it fails to hold everything else constant. It could simply be that younger users post more extreme content online. It would be better if they compared the online college communities with a matched set of subreddits.

hshi420 commented 2 years ago

I think their choice of subreddits can be a confounding variable. Why didn't the authors choose the subreddits in a more systematic way? Or what is the rationale behind their choice of comparison subreddits?

kelseywu99 commented 2 years ago

Going off the above thread, I have a similar question about the possible selection bias in the paper. The nature of subreddits and the authors' selection of texts are concerning, since a subreddit can be an eclectic amalgamation of extreme posting.

sudhamshow commented 2 years ago

I am unclear about the motivation for using a lexicon-based approach without accounting for context. Wouldn't this introduce inaccuracies, given that the study's goal is to establish causality? For example, swear words are sometimes used in a tone of excitement, or between two users who know each other closely. From first-hand experience, I have also observed that curse words and other terms a lexicon would deem hate speech are frequently used while congratulating or complimenting someone (bizarre as that is), especially among college-age students.
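
To make this failure mode concrete, here is a minimal sketch (the tiny lexicon and the sentiment threshold are made up, and this is not the authors' pipeline) of how a context-free lexicon match differs from one that at least checks the sentiment of the surrounding message before flagging it.

```python
# Hedged sketch: bare lexicon matching vs. a crude context check.
# The two-word "lexicon" and the 0.3 threshold are placeholders for illustration.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

hate_lexicon = {"idiot", "trash"}  # placeholder terms, not the paper's lexicon

def lexicon_flag(text):
    """Context-free match: flag any post containing a lexicon term."""
    return any(tok.strip(".,!?") in hate_lexicon for tok in text.lower().split())

def contextual_flag(text):
    """Keep the lexicon hit only if the message is not clearly positive overall."""
    return lexicon_flag(text) and sia.polarity_scores(text)["compound"] < 0.3

posts = [
    "you idiot, congrats on getting into med school, so proud of you!!",
    "get out of this sub, you idiot, nobody wants you here",
]
for p in posts:
    print(lexicon_flag(p), contextual_flag(p), "|", p)
```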

ttsujikawa commented 2 years ago

This paper gave me very interesting insights! Regarding causal inference on hate speech on campuses, I think the impact of online hate speech and that of interpersonal, face-to-face hate speech should be differentiated. That being said, I was wondering whether there is any way to more accurately capture students' sentiments about their interactions and movements, for example through questionnaires?