EddieHubCommunity / EddieBot

Inclusive words Discord bot (no more "hey guys")
http://discord.eddiehub.org
MIT License
174 stars 138 forks

Does not flag offensive words #405

Closed adityaraute closed 3 years ago

adityaraute commented 3 years ago

Description

EddieBot flags non-inclusive pronouns, but it does not flag many offensive words. It would be better if EddieBot could flag those as well.

Screenshots

I can add screenshots but I will refrain from doing so as the words can be offensive. Head over to the Discord Server's #bot-chat channel to find live examples.

Additional Context

Will add the screenshots if someone explicitly requests them.

Join Eddie's discord community here

github-actions[bot] commented 3 years ago

It's great having you contribute to this project

Feel free to raise an Issue! Welcome to the community :nerd_face:

If you would like to continue contributing to open source and would like to do it with an awesome inclusive community, you should join our Discord chat and our GitHub Organisation - we help and encourage each other to contribute to open source little and often 🤓 . Any questions let us know.

Vyvy-vi commented 3 years ago

We could perhaps use a second library for this? (Personally, I was thinking: if we could run the message content through some sentiment-classifier lib to get more context, and then send the interpretation through to alex, I think we could account for this there too 🤔)

adityaraute commented 3 years ago

> We could perhaps use a second library for this? (personally, I was thinking, if we could run the message content through some sentiment classifier lib, to get more context and then send the interpretation through to alex, I think we could somehow account for this in that too 🤔 )

That would be too heavy, in my opinion. A bot should not be burdened with so much processing, since we need quick responses, especially in this case.

Vyvy-vi commented 3 years ago

oh :+1: Maybe (though I'm not sure about this) we could add an extra RESTRICTED_WORDS_LIST in the config.ts file and match it against the message content with a regex, in order to catch text that isn't flagged by alex?

(*Is there a way to add a project-specific word to alexjs-ban-list?)
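A minimal sketch of how that regex check could look. The `RESTRICTED_WORDS_LIST` name and its contents are hypothetical placeholders (nothing here is from EddieBot's actual `config.ts`):

```typescript
// Hypothetical addition to config.ts - the list name and contents are
// illustrative placeholders, not EddieBot's real configuration.
const RESTRICTED_WORDS_LIST: string[] = ['badword', 'slur'];

// Build one case-insensitive regex with word boundaries so substrings
// inside innocent words are not flagged.
const restrictedPattern = new RegExp(
  `\\b(${RESTRICTED_WORDS_LIST.join('|')})\\b`,
  'i'
);

// Returns true when the message contains a restricted word.
function containsRestrictedWord(content: string): boolean {
  return restrictedPattern.test(content);
}

console.log(containsRestrictedWord('this contains a BADWORD here')); // true
console.log(containsRestrictedWord('a perfectly fine message'));     // false
```

The word-boundary anchors matter: without them, a list entry could match inside harmless words (the classic "Scunthorpe problem").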

eddiejaoude commented 3 years ago

What about something like this https://github.com/web-mech/badwords#readme
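Per its README, that library exposes a small interface (`isProfane`, `clean`, `addWords`). Here is a self-contained sketch of the shape such an integration would take, with a tiny stub class standing in for the real package so the snippet runs on its own:

```typescript
// Stand-in for `new Filter()` from the bad-words package; the real
// library ships its own word list and exposes methods with these names.
class ProfanityFilter {
  constructor(private words: string[] = ['badword']) {}

  // True when the text contains any listed word (whole words only).
  isProfane(text: string): boolean {
    return text
      .split(/\s+/)
      .some((w) => this.words.includes(w.toLowerCase()));
  }

  // Replace each listed word with asterisks, like the library's clean().
  clean(text: string): string {
    return text
      .split(/\s+/)
      .map((w) =>
        this.words.includes(w.toLowerCase()) ? '*'.repeat(w.length) : w
      )
      .join(' ');
  }
}

const filter = new ProfanityFilter();
console.log(filter.isProfane('what a badword')); // true
console.log(filter.clean('what a badword'));     // "what a *******"
```

With the real package, the bot would construct one `Filter` instance and call `isProfane` on each incoming message.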

Vyvy-vi commented 3 years ago

That looks interesting :+1: Though, since we're considering adding another library, it reminds me of something @nhcarrigan said in the Discord server: [screenshot from 2021-01-06]. Is that a viable option? 🤔

adityaraute commented 3 years ago

Whatever we use, we need to have a distinction between non-inclusive pronouns and profane language. We can't judge them the same way, nor can we react to them similarly.

eddiejaoude commented 3 years ago

We used to have one that was custom built, but I think using a library that has been used before is probably a better starting point.

I agree with @dumbcoder7 that "no tolerance" is best for bad language and there are 2 actions.

My thoughts are:

adityaraute commented 3 years ago
> - add another feature that uses a library to check for bad words, which removes the message (similar to the standup command) and posts a red bot response saying something like "please do not use bad language here"

We could code this ourselves, or use the bad-words library you mentioned before. What do you recommend?

eddiejaoude commented 3 years ago

We will have to integrate the library, so we will have to write code, but the less code we write for the profanity part the better; let the library do the heavy lifting.

adityaraute commented 3 years ago

I meant coding the entire mechanism and only picking the 'bad' words from the library. But sure, no reason to overwork ourselves.

naomi-lgbt commented 3 years ago

To clarify, if we scrap Alex.js and write our own parser, we could:

eddiejaoude commented 3 years ago

We originally started off with our own parser. Now, having seen both ways of doing it, I definitely believe using AlexJS is the better option; then we can add the words we don't want flagged to its ignore config (this is a great green square for people too).

Vyvy-vi commented 3 years ago

There's a new discussion #656 that discusses using NLP (I'm linking it here so that we can perhaps get some context from there as well).

adityaraute commented 3 years ago

Folks, I appreciate the discussion, but a mechanism to remove obviously offensive words needs to be put in place ASAP. EddieHub is growing, and with more members we don't want exposed vulnerabilities. NLP and contextual flagging can be a secondary issue, but in my opinion the bad-words library should be implemented quickly.

Vyvy-vi commented 3 years ago

I think we could look at alex (or retext, the library it uses) and implement the bad-word flags from there, or we could go with what was suggested earlier:

> What about something like this https://github.com/web-mech/badwords#readme

adityaraute commented 3 years ago

I prefer the latter, but sadly I currently have neither the time nor the knowledge to implement this feature. I'd request whoever can to add it as soon as possible.

eddiejaoude commented 3 years ago

Yep, something simple and sooner is best 👍

Vyvy-vi commented 3 years ago

I'm interested in working on this 🙃

eddiejaoude commented 3 years ago

> The profanitySureness field is a number (the default is 0). We use cuss, which has a dictionary of words that have a rating between 0 and 2 of how likely it is that a word or phrase is a profanity (not how "bad" it is). The profanitySureness field is the minimum rating (inclusive) that you want to check for. If you set it to 1 (maybe) then it will warn for level 1 and 2 (likely) profanities, but not for level 0 (unlikely).

AlexJS uses a profanity filter; I think we have the config set incorrectly.

naomi-lgbt commented 3 years ago

We have profanitySureness set to 2, which is the least aggressive level of flagging (only level-2 "likely" profanities get caught). We could try bumping it down to 1 to see if it catches more.
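For reference, the change being discussed is a one-field tweak. A sketch of the relevant fragment (the field name follows alex's documented config; the surrounding object is illustrative, not EddieBot's actual `config.ts`):

```typescript
// Sketch of the alex config change under discussion. profanitySureness
// is the minimum cuss rating that gets flagged: 2 = only "likely"
// profanities, 1 = "maybe" and "likely", 0 = everything including
// "unlikely". Lowering 2 -> 1 therefore makes the filter stricter.
export const alexConfig = {
  profanitySureness: 1, // was 2; now flags level 1 and 2 profanities
};
```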

Vyvy-vi commented 3 years ago

It catches more, but it is still not as functional as needed.

Vyvy-vi commented 3 years ago

Moreover, could we maybe make the bot censor profane words?

naomi-lgbt commented 3 years ago

How so? The most we could do is have the bot delete the offensive message entirely, since a bot can't edit a user's message. We could have it re-send the message as an edited form, but that gets messy IMHO.

Vyvy-vi commented 3 years ago

No, I meant in the embed made by the bot. Example: a user says the b-word, and EddieBot sends a warning embed like this:

> You used the word "b****" (or some iteration of censoring).
> This is profane.
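Producing that censored string is straightforward; a minimal sketch (the helper name is hypothetical, not an existing EddieBot function):

```typescript
// Hypothetical helper: keep the first letter, star out the rest,
// so "badword" becomes "b******" in the bot's warning embed.
function censor(word: string): string {
  if (word.length <= 1) return '*';
  return word[0] + '*'.repeat(word.length - 1);
}

// Example embed text the bot could send instead of echoing the word:
const flagged = 'badword';
console.log(`You used the word "${censor(flagged)}" - this is profane.`);
```

This keeps the user's original message out of the bot's reply entirely, which avoids the re-posting messiness mentioned above.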
Vyvy-vi commented 3 years ago

Also, a slight update: profanitySureness at 1 works pretty much okay. @adityaraute, could you perhaps DM me examples of what the bot was not flagging as offensive? (That way we can be sure that setting it to 1 is sufficient.)

adityaraute commented 3 years ago

@Vyvy-vi Yes, I think level 1 should be enough. I did mean words that are overtly offensive, with no other possible interpretation whatsoever. This includes abusive and profane language.

eddiejaoude commented 3 years ago

Recently the word a**hole was not flagged or removed; it required manual removal. The config is still on 2, though. I will update it to 1 now.

https://github.com/EddieHubCommunity/EddieBot/blob/42193a569c323d36fa89b23884c4208d509978d4/src/config.ts#L103

eddiejaoude commented 3 years ago

This word is now flagged, but the whole message should really be removed

eddiejaoude commented 3 years ago

I think we can find out what has been triggered from the source field of the response object (e.g. source: 'retext-equality'), and if cuss is the trigger, the message can be removed. It's probably best to leave a reply tagging the author and saying the message was removed.

[
  [1:17-1:20: `his` may be insensitive, when referring to a person, use `their`, `theirs`, `them` instead] {
    message: '`his` may be insensitive, when referring to a ' +
      'person, use `their`, `theirs`, `them` instead',
    name: '1:17-1:20',
    reason: '`his` may be insensitive, when referring to a ' +
      'person, use `their`, `theirs`, `them` instead',
    line: 1,
    column: 17,
    location: { start: [Object], end: [Object] },
    source: 'retext-equality',
    ruleId: 'her-him',
    fatal: false,
    actual: 'his',
    expected: [ 'their', 'theirs', 'them' ]
  }
]

https://github.com/get-alex/alex#api
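That check could be sketched as follows. The message shape mirrors the alex output above; the source value for cuss-based flags (`'retext-profanities'`) is an assumption that should be verified against real bot output:

```typescript
// Minimal shape of the alex/vfile messages shown above.
interface AlexMessage {
  reason: string;
  source: string; // e.g. 'retext-equality' or 'retext-profanities'
  ruleId: string;
}

// Decide whether the whole Discord message should be deleted:
// equality warnings get a gentle reply, profanity triggers removal.
// NOTE: 'retext-profanities' as the source for cuss hits is an
// assumption, not confirmed in this thread.
function shouldRemoveMessage(messages: AlexMessage[]): boolean {
  return messages.some((m) => m.source === 'retext-profanities');
}

// The sample output above: an equality warning, so no removal.
const example: AlexMessage[] = [
  {
    reason: '`his` may be insensitive, when referring to a person',
    source: 'retext-equality',
    ruleId: 'her-him',
  },
];
console.log(shouldRemoveMessage(example)); // false
```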

adityaraute commented 3 years ago

Yes, that is the ideal way. Whoever can implement this, please make haste. It's already been two months since this issue was raised, and cusses can still be used.

Vyvy-vi commented 3 years ago

I'll try and open a PR after a while :)

github-actions[bot] commented 3 years ago

Stale issue message

Vyvy-vi commented 3 years ago

I think this can be closed, as this was resolved by 81b9584cf8e87989f8e3754959367643f6a1ff72