adaamko / POTATO

XAI based human-in-the-loop framework for automatic rule-learning.

Building rules for homophobia detection - issues #54

Open kuma-rtin opened 2 years ago

kuma-rtin commented 2 years ago

As agreed in our meeting last Wednesday, I have compiled a few examples of rules that are either (a) hard for me to formulate or (b) not working as expected (whether because of the parsed graphs, the formulation of the rule, or something else).

A. Doesn't work as expected: "I am gay" and related structures

I have noticed that the parsed UD graphs are sometimes a bit off. I want to exclude tweets that express some form of "I am gay" (and similarly for the words dyke, lesbian, queer, etc.), so I formulated the following exception to a rule: (u_1 / gay :nsubj (u_4 / I)). Unfortunately, some tweets containing this structure were still matched:

  1. "im gay i love fathers dm for dilfs" [image]

Here the parser did not match "I" as a subject in the first part of the sentence.

  2. "when i say i only like seven men i mean i only love seven men bc im fucking gay" [image]

Same as in 1.

  3. "I ' m super gay fuck" [image]

This graph is not correct. The unusual spelling of I'm and the f*** at the end of the sentence might have thrown the parser off.

  4. "yea im def a dyke cause pretty boys r so ugly to me n also all the rest of them" [image]

All of these seem to be problems with "internet" spelling and/or internet slang.
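One possible mitigation is to normalize informal spellings before the text reaches the parser. The following is only a minimal sketch: the `NORMALIZATIONS` table and the `normalize` function are hypothetical names that are not part of POTATO, and whether the parser then attaches "I" as the nsubj of "gay" has not been verified here.

```python
import re

# Hypothetical normalization table (an assumption, not part of POTATO).
# Each entry maps a regex for an informal spelling to its standard form.
# Order matters: spaced-out or missing apostrophes are handled first.
NORMALIZATIONS = [
    (r"\bim\b", "I'm"),            # "im"    -> "I'm"
    (r"\bi\s*'\s*m\b", "I'm"),     # "I ' m" -> "I'm"
    (r"\bi\b", "I"),               # "i"     -> "I"
]


def normalize(text: str) -> str:
    """Rewrite common informal first-person spellings to standard forms."""
    for pattern, replacement in NORMALIZATIONS:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


examples = [
    "im gay i love fathers dm for dilfs",
    "I ' m super gay fuck",
]
normalized = [normalize(tweet) for tweet in examples]
```

A table like this only covers the first-person forms seen in the examples above; slang such as "def" or "r" would need further entries, and each new entry risks rewriting text that was already correct.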

B. Doesn't work as expected: Plural forms

UD graphs don't capture the number of nouns, so it is not possible to match only their plural forms. In my case this led to the following problem: the plural form queers is often used in hate speech against LGBTQ+ people in the dataset. If I were able to match only the plural form of queer, this would be a very simple and comprehensible rule. See the following examples:

  1. "and all this butthurt faggotry just makes me want to join twp to spite you queers" [image]
  2. "queers in olympics are the same in the military a huge fucking mistake" [image]
  3. "there shouldnt be an arguement against this but once the queers got a voice it was prolly too late" [image]

I worked around this by formulating the rule as follows: (u_1 / .* :obj|nsubj (u_2251 / queer)). The plural form mainly refers to queer people as a group and therefore most often appears as an object or a subject. Unfortunately, the following sentence is then matched as well (which wouldn't happen if I were able to restrict the rule to plural forms): "so then queer is not queer how about homosexuality"...

[image]
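The UD annotation standard itself records grammatical number in the morphological features (FEATS) column, for example `Number=Plur`. If the parser output behind the graphs still carries those features, a post-filter on the matched token could separate "queers" from "queer". The sketch below assumes this and uses hand-written token annotations rather than real parser output; `parse_feats`, `is_plural` and `has_plural_lemma` are hypothetical helpers, not POTATO functions.

```python
def parse_feats(feats: str) -> dict:
    """Parse a CoNLL-U style FEATS string such as 'Number=Plur|Case=Nom'."""
    if feats in ("", "_"):  # "_" marks an empty FEATS column in CoNLL-U
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def is_plural(feats: str) -> bool:
    """Return True if the FEATS string marks the token as plural."""
    return parse_feats(feats).get("Number") == "Plur"


def has_plural_lemma(tokens: list, lemma: str) -> bool:
    """Return True if any token has the given lemma and is marked plural."""
    return any(t["lemma"] == lemma and is_plural(t["feats"]) for t in tokens)


# Hand-written annotations for illustration only (not actual parser output).
spite_you_queers = [
    {"form": "spite", "lemma": "spite", "feats": "VerbForm=Inf"},
    {"form": "you", "lemma": "you", "feats": "Person=2|PronType=Prs"},
    {"form": "queers", "lemma": "queer", "feats": "Number=Plur"},
]
queer_is_not_queer = [
    {"form": "queer", "lemma": "queer", "feats": "Number=Sing"},
    {"form": "is", "lemma": "be", "feats": "Number=Sing|Person=3"},
    {"form": "not", "lemma": "not", "feats": "_"},
    {"form": "queer", "lemma": "queer", "feats": "Degree=Pos"},
]

plural_hit = has_plural_lemma(spite_you_queers, "queer")
plural_miss = has_plural_lemma(queer_is_not_queer, "queer")
```

Such a filter would run after the graph rule has matched, so the rule itself stays simple and the number check remains a separate, inspectable step.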

C. Formulation problem: words in the same sentence/clause

I would like to match tweets that include the word gay together with a negative word such as f***, sh**, kill, die, etc. I want to do this in order to capture all the different meanings these negative words can have (e.g. f*** gays, but also f***ing gays). The problem is that such a rule only seems valid when gay and the negative word are in the same sentence or clause (which suggests that the negative word is actually related to the word gay). Here are some examples that I would NOT want to match:

  1. "i feel like such a yt gay when i listen to kim petras but fuck i cant resist im so sorry" [image]
  2. "it okay to be white black straight or gay but it is not okay for you to stop at a yellow light when we both could have fucking made it" [image]
  3. "i miss the old queer eye where they d turn the straight guy into a slutty club gay with snake skin boots and frosted tips and instruct him go say shit like yo it fresh" [image]
  4. "im the gay one nigger you the one fucking my ass" [image]

This rule is really just an idea, and I am no longer sure that it is a good one. If it is not too complicated to formulate, I would nevertheless like to try it out and see the results.
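One way to approximate "in the same clause" is to require that the two words lie close together in the dependency graph. The following is a rough sketch under stated assumptions: the edge lists are hand-made illustrations rather than real parses, `graph_distance` and `related` are hypothetical helpers outside POTATO, and the threshold of 2 is an arbitrary choice that would need tuning on the dataset.

```python
from collections import deque


def graph_distance(edges, source, target):
    """Shortest path length between two nodes, ignoring edge direction.

    `edges` is an iterable of (head, dependent) pairs.
    Returns None if either node is missing or no path exists.
    """
    adjacency = {}
    for head, dependent in edges:
        adjacency.setdefault(head, set()).add(dependent)
        adjacency.setdefault(dependent, set()).add(head)
    if source not in adjacency or target not in adjacency:
        return None
    queue = deque([(source, 0)])
    seen = {source}
    while queue:  # breadth-first search
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def related(edges, word_a, word_b, max_dist=2):
    """Treat two words as related if they are at most max_dist edges apart."""
    dist = graph_distance(edges, word_a, word_b)
    return dist is not None and dist <= max_dist


# Hand-made toy graphs (assumptions for illustration, not parser output).
same_clause = [("kill", "gays")]
different_clauses = [
    ("okay", "gay"), ("okay", "stop"), ("stop", "light"),
    ("stop", "made"), ("made", "fucking"),
]

close_pair = related(same_clause, "gays", "kill")
distant_pair = related(different_clauses, "gay", "fucking")
```

A distance threshold is cruder than a true clause boundary check, but it is easy to inspect and could serve as a first experiment before committing to a more complex rule formulation.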