languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1

clean up for PROFANITY #1938

Closed TiagoSantos81 closed 4 years ago

TiagoSantos81 commented 4 years ago

Clean-up of profanities.txt based on the regression results.

danielnaber commented 4 years ago

Here's a list of what sound like false alarms to me (PROFANITY and RUDE_SARCASTIC, from Tatoeba):

It's fine, just make sure you don't say that again.
God was truly glorified today!
What's new?
When the dog tried to bite me, I held him down by the neck...
Yes, sausage and sauerkraut please.
A single ray of sunlight shone through a chink in the shuttered window.
We have to nip this problem in the bud before it gets any worse....
Squaw Valley, California
illegal immigrant
...the middle finger, the ring finger, and the pinky.
What do you know about Tom's girlfriend?
knock up
Tom graduated with Magna Cum Laude.
I hope you're happy together.
transgender
Thanks a heap.
I have to change the baby's nappy.
get screwed
She has snow-white skin.
...the tick of the clock, the whoosh of cars passing by the house.
So what else is new?
genderqueer
One day our children will take over our paddy planting.

The following ones might not be false alarms, but they show that the current message isn't specific enough. It needs to explain more about why the term can be inappropriate, and possibly suggest alternatives:

Deaf-mute people talk using sign language.
The Koran does not permit Mohammedans to drink.
My aunt is from Somalia. She is Somalian.
Can I scrounge a fag?

danielnaber commented 4 years ago

More potential false alarms for PROFANITY from https://internal1.languagetool.org/regression-tests//20190913/result_en_20190913.html:

TiagoSantos81 commented 4 years ago

Yesterday a few words that I thought I had removed slipped through (e.g. illegal immigrants). That should be fixed now. Regarding this last list, which I did not handle: scientific, regional and foreign names were already tagged (but as generic spelling errors). They are rare occurrences, even for specialized writers; the writer is well aware of the special meaning being used, and they can be basically anything. In countries where naming conventions are not mandatory when registering children's names, you can have profanities as names. Translate https://www.techtudo.com.br/noticias/noticia/2014/07/facebook-exclui-perfis-de-usuarios-brasileiros-com-nomes-estranhos.html if you are up for a laugh. Informal and formal tags are not yet handled, but I will push that as well in the coming days. For now, lowercase 'willy' should be tagged as something wrong, the same as 'willy nilly' is.

Of course, if you think that words such as "negro" are not offensive, and that "mama" and "papa" are not childish, go ahead and remove them. That is on you, but I won't do it, since I would not be doing anyone I care about a favor.

TiagoSantos81 commented 4 years ago

All cases and a few more are solved. Closing.