OSMCha / osmcha-frontend

Frontend for the osmcha-django REST API
https://osmcha.org
ISC License

Profanity Filter #237

Open jharpster opened 6 years ago

jharpster commented 6 years ago

Expand the profanity filters and make them multi-lingual.

Brief Description

The existing word list is inadequate to address more than the simplest profanities.

What is the motivation / use case for this feature?

Create more robust vandalism detection

What is the expected behaviour?

Consider incorporating a broader set of profanities from this list.

mvexel commented 3 years ago

I notice that the connected MapRoulette Challenge has a very high number of tasks marked as Not an Issue (false positive). As the MapRoulette superuser I am getting some complaints about the tasks in this Challenge. I would recommend that we disable this MapRoulette Challenge until the quality of the filter can be improved. Thanks.

[Screenshot: Screen Shot 2021-02-15 at 11 59 21 AM]

matkoniecz commented 3 years ago

One glaring issue is that the MapRoulette Challenge does not list what is supposed to be the profanity.

So I have no idea whether it is an outright bug, pattern matching of English profanities against text in other languages, or something else.

Looking at this example, I am unable to spot what caused it to be reported, and I am not sure which English profanity matched here. I have not seen a single valid report in Poland.

[Screenshot: screen02]
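
To illustrate the second hypothesis (pattern matching against text in other languages), here is a minimal sketch of how substring matching can flag harmless names. The word list and both function names are hypothetical, chosen only for illustration; this is not OSMCha's actual word list or implementation.

```python
import re

# Hypothetical word list for illustration only; not the list OSMCha uses.
WORD_LIST = ["ass", "hell"]


def substring_matches(text, words=WORD_LIST):
    """Return listed words found anywhere in the text, even inside longer words."""
    lowered = text.lower()
    return [w for w in words if w in lowered]


def whole_word_matches(text, words=WORD_LIST):
    """Return listed words that appear as standalone words in the text."""
    lowered = text.lower()
    return [w for w in words if re.search(rf"\b{re.escape(w)}\b", lowered)]


# Substring matching flags harmless names; whole-word matching does not.
print(substring_matches("Hauptstrasse"), whole_word_matches("Hauptstrasse"))
print(substring_matches("Shell"), whole_word_matches("Shell"))
# Whole-word matching still flags "Hell", a village in Norway.
print(whole_word_matches("Hell"))
```

Both functions return the matched words, so a report could show which entry triggered the flag. Whole-word matching alone would not remove cross-language false positives, since a word that is offensive in one language can be an ordinary word or place name in another.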

willemarcel commented 3 years ago

@mvexel Thanks for the feedback. I have stopped updating this challenge.

@matkoniecz I'll evaluate the possibility of improving or disabling the profanity filter in the next few days.

bugdebugger commented 3 years ago

@willemarcel

I too stumbled on this problem on MapRoulette.

  1. You can't tell why the node was tagged with the profanity tag.
  2. Most tasks are resolved as "Not an issue".
  3. I did quite a few of them myself. All were false positives.

So I went digging and figured the following things out. Some of this is probably obvious if you are familiar with OSM and the code around it. I wasn't :smile:

Then I checked the word-lists in all languages I understand

The many false positives are caused by the combination of the above findings.

Some examples of what currently happens

matkoniecz commented 3 years ago

> Thanks for the feedback. I have stopped to update this challenge.

Would it be possible to take it down completely or archive it?

https://maproulette.org/browse/challenges?query=profanity

[Screenshot: screen02]

It would be worth sparing people using MR the time of manually marking 2800 entries as invalid.