UChicago-Computational-Content-Analysis / Readings-Responses-2024-Winter


7. Deep Learning to Perform Causal Inference - [E1] Field, Anjalie and Yulia Tsvetkov. #19

Open lkcao opened 9 months ago

lkcao commented 9 months ago

Post questions here for this week's exemplary readings:

  1. Field, Anjalie and Yulia Tsvetkov. “Unsupervised Discovery of Implicit Gender Bias”. arXiv preprint arXiv:2004.08361.

michplunkett commented 7 months ago

I appreciated them opting for unsupervised modeling as opposed to a supervised approach; it seems like a much more scalable route to accomplish this task. It also potentially adds some measure of standardization to the categorization of statements, which helps navigate some of the issues around having an actual person do the categorization work. Given that the model is trained on Facebook and Reddit data, are there any concerns about the external validity of this type of experiment? What are reasonable expectations to have of bias detection models when they are used outside of their specific context?

bucketteOfIvy commented 7 months ago

The model proposed in this paper relies on propensity matching and (informally) attempts to control for confounding caused by "prompts" that elicit certain sorts of replies. So, to use the authors' example, the model attempts to control for situations where women disproportionately ask "Do I look good?", which would otherwise produce a biased model that treats common responses to that question as evidence of bias. The direct implementation seems to rely on the data being conversational (i.e., there is a message that prompts a response), but often we're interested in bias in datasets that are not conversational, such as newspaper corpora. In these cases, is it possible to propensity match on something else, such as article subject, or are entirely different approaches more appropriate?
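For concreteness, here is a minimal sketch of what propensity matching on prompt text could look like, assuming a toy dataset with prompt text, reply text, and an addressee-gender label (variable names like `prompts` and `addressee_is_woman` are hypothetical; this is not the authors' released pipeline):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data: each comment replies to a prompt addressed to someone whose
# (binary, for illustration) gender label we know.
prompts = ["Do I look good in this photo?", "Thoughts on the new tax bill?",
           "Just finished my first marathon!", "How do I fix this bug?"]
comments = ["You look gorgeous!", "Strong analysis, thanks.",
            "So proud of you!", "Try checking the stack trace."]
addressee_is_woman = np.array([1, 0, 1, 0])

# 1. Estimate the propensity score: P(addressee is a woman | prompt text).
vec = TfidfVectorizer()
X_prompts = vec.fit_transform(prompts)
propensity_model = LogisticRegression().fit(X_prompts, addressee_is_woman)
scores = propensity_model.predict_proba(X_prompts)[:, 1]

# 2. Match each comment addressed to a woman with the comment addressed to a
#    man whose prompt has the closest propensity score (1-NN matching).
women_idx = np.where(addressee_is_woman == 1)[0]
men_idx = np.where(addressee_is_woman == 0)[0]
matches = {i: men_idx[np.argmin(np.abs(scores[men_idx] - scores[i]))]
           for i in women_idx}

# Matched pairs now share similar prompt "risk" of eliciting gendered replies,
# so differences between the paired comments are less confounded by what the
# prompt asked for.
for i, j in matches.items():
    print(comments[i], "<->", comments[j])
```

For a non-conversational corpus, the same kind of matching could in principle be run on article subject or other metadata instead of the prompt, though whether that controls for the right confounds is exactly the open question here.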

erikaz1 commented 7 months ago

This is a fascinating paper that interprets model performance with a twist in order to detect gender bias in text. The authors use propensity matching to control for confounds in the comment text that stem directly from potentially gendered content in the original post. If the resulting model still predicts the gender label with high confidence, this is taken as a sign of implicit bias (sketched below).

Field & Tsvetkov state that overly gendered language such as "Bro" are substituted. Aren't explicitly gendered objects still potential examples of bias? Why is "Bro" not indicative of implicit sentiment but "Beautiful" is? Where do we draw the line? How have our understanding of implicit and explicit biases changed over time, if at all?

ana-yurt commented 7 months ago

This is a fascinating paper! I wonder if the authors also managed to control for the traits of the comment authors, since those authors did not randomly decide to follow the original writers. For example, are certain types of people more likely to follow female politicians than male ones? How might we control for this?

Carolineyx commented 7 months ago

I like the idea of unsupervised learning enabling machines to discover things humans typically can't detect. However, my question is whether it really makes sense to label certain patterns as biases when humans may not recognize them as such, or whether gender roles themselves are simply socially constructed and performed. How can a machine differentiate between "different choices/opinions" and "bias"?

JessicaCaishanghai commented 7 months ago

Unsupervised learning is like a quantitative version of grounded theory, in that no predefined concepts are imposed before forming the theory. However, I also think the approach can easily introduce bias of its own: algorithms are still designed by human beings, and some critical design choices are inevitable. How might different cultures influence unsupervised learning, given that there will obviously be cross-cultural differences?