3chospirits / badwords-filter

An easy-to-use word filter with advanced detection techniques. A lightweight package with zero dependencies.

Bug: Bad words with accented characters not getting detected #3

Open CookedApps opened 2 years ago

CookedApps commented 2 years ago

Hey, I think I found a possible bug: if a bad word in the filter list contains accented characters, the filter does not flag that word when it is written exactly as defined. It only flags the version with the accents normalized away.

Example:

  1. Define the filter with a custom bad word: const filter = new Filter({list: ["wörd"]});
  2. Checking the bad word as written gives a false negative: filter.isUnclean("wörd") returns false
  3. Checking the unaccented spelling gives a false positive: filter.isUnclean("word") returns true

Expected behavior:

3chospirits commented 2 years ago

This filter is designed for English only. Very few words with accented characters need to be censored. In those cases, adding the non-accented version of the word to the filter list makes things a lot easier. The word list is expected to be normalized already when you load it.
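
For anyone who wants to pre-normalize a custom list as suggested above, here is a minimal sketch. It assumes that "normalized" means "diacritics stripped"; the thread does not spell out the filter's exact normalization rules. The helpers stripAccents and normalizeList are hypothetical names and are not part of badwords-filter. They rely only on the standard String.prototype.normalize method.

```javascript
// Hypothetical helper, not part of badwords-filter: strips diacritics from a string.
function stripAccents(text) {
  // NFD splits "ö" into "o" + U+0308 (combining diaeresis);
  // the regex then removes every combining mark (Unicode category M).
  return text.normalize("NFD").replace(/\p{M}/gu, "");
}

// Hypothetical helper: normalizes a custom word list before it is handed to the filter.
function normalizeList(words) {
  // A Set drops duplicates that only appear once the accents are removed.
  return [...new Set(words.map(stripAccents))];
}

const list = normalizeList(["wörd", "word", "café"]);
console.log(list); // [ 'word', 'cafe' ]
```

The resulting list can then be passed to the constructor the same way as in the report, i.e. new Filter({list}). Note that this approach only handles characters that decompose into a base letter plus a combining mark; letters such as "ø" or "ß" pass through unchanged and would need their own mapping.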