thisandagain / washyourmouthoutwithsoap

A list of bad words in many languages.
MIT License

Set source locale #3

Closed. ericrosenbaum closed this 5 years ago

ericrosenbaum commented 5 years ago

Prevent false positive results for common words (see https://github.com/LLK/scratch-vm/issues/1891).

The problem was that we were inadvertently relying on Google Translate's auto-detect feature. Some of the bad words were detected as words in another language and translated from their meaning in that language into common English words (e.g. 'hore' is auto-detected as a Slovak word meaning 'up'). As a result, unexpected common words ended up in every language list, including English.

This change sets the source language explicitly to English, and skips translation for the English output list.
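A minimal sketch of that approach, written in Python rather than taken from the project's code. The names `SOURCE_LOCALE`, `build_locale_lists`, and `fake_translate` are hypothetical, and the translator is a stand-in that makes no network calls; the actual change may be structured differently.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical translator signature: (text, source_locale, target_locale) -> text.
Translate = Callable[[str, str, str], str]

SOURCE_LOCALE = "en"  # assumption: the master word list is English


def build_locale_lists(
    words: List[str], target_locales: List[str], translate: Translate
) -> Dict[str, List[str]]:
    """Build one word list per locale from an English master list."""
    lists: Dict[str, List[str]] = {}
    for locale in target_locales:
        if locale == SOURCE_LOCALE:
            # English output: copy the master list, never translate it.
            lists[locale] = list(words)
        else:
            # Always pass the source locale so the service cannot auto-detect.
            lists[locale] = [translate(w, SOURCE_LOCALE, locale) for w in words]
    return lists


# Stand-in translator for demonstration: records each call and tags the word.
calls: List[Tuple[str, str, str]] = []


def fake_translate(text: str, source: str, target: str) -> str:
    calls.append((text, source, target))
    return f"{text}@{target}"


result = build_locale_lists(["hore"], ["en", "fr"], fake_translate)
```

With the source locale fixed, 'hore' is always treated as English input, and the English list is a plain copy that never passes through the translator.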

I also added a unit test for the known issues ('up' in English and 'comment' in French).
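The kind of regression check described here could look roughly like the following Python sketch. The helper `find_false_positives` and the sample lists are hypothetical and are not the project's actual test; only the two known false positives ('up' and 'comment') come from this PR.

```python
from typing import Dict, List

# Common words that must never appear in a locale's list (the known issues).
KNOWN_FALSE_POSITIVES: Dict[str, List[str]] = {"en": ["up"], "fr": ["comment"]}


def find_false_positives(lists: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return, per locale, the known common words found in that locale's list."""
    found: Dict[str, List[str]] = {}
    for locale, common_words in KNOWN_FALSE_POSITIVES.items():
        listed = {w.lower() for w in lists.get(locale, [])}
        hits = [w for w in common_words if w in listed]
        if hits:
            found[locale] = hits
    return found


# Hypothetical sample data: a list built with auto-detect vs. one built after the fix.
broken = {"en": ["up", "placeholder"], "fr": ["comment"]}
fixed = {"en": ["placeholder"], "fr": ["espace-reserve"]}
```

A test would then assert that `find_false_positives` returns an empty mapping for the generated lists.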

thisandagain commented 5 years ago

Published as washyourmouthoutwithsoap@1.0.2