YouHaveTrouble / Censura

Advanced censorship plugin for Minecraft servers
GNU General Public License v3.0

Replace normalization algorithm #3

Closed TheEpicBlock closed 3 years ago

TheEpicBlock commented 3 years ago

This replaces the normalization algorithm with something similar to Pagi-Bot's current algorithm. Instead of removing all non a-z characters, it only removes them when they are next to a single letter. Examples:

t est       -> test
tés t       -> test
t e s t     -> test
t.e.s    t  -> test
test test   -> test test

This prevents matches from spanning word boundaries. A downside is that te st will not be detected, but I believe this is a valid compromise. As mentioned above, a similar algorithm has been running in Pagi-Bot for a long time.
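For readers who want to see the rule in isolation, here is a minimal sketch of the "join around single letters" idea. It is not the code from this PR: the class and method names are hypothetical, and it assumes diacritics have already been stripped and that a separator that is kept becomes a single space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not the plugin's actual implementation.
final class SingleLetterJoinSketch {

    // Drops a separator only when the word on either side of it is a single letter.
    // Assumption: input has no diacritics; any non a-z character counts as a separator.
    static String normalize(String input) {
        String lower = input.toLowerCase(Locale.ROOT);

        // Split the input into runs of a-z letters.
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            if (c >= 'a' && c <= 'z') {
                current.append(c);
            } else if (current.length() > 0) {
                words.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }

        // Re-join: keep a space only between two multi-letter words.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            if (i > 0) {
                boolean join = words.get(i - 1).length() == 1 || word.length() == 1;
                if (!join) {
                    out.append(' ');
                }
            }
            out.append(word);
        }
        return out.toString();
    }
}
```

With this sketch, `normalize("t e s t")` and `normalize("t.e.s    t")` both give `test`, while `normalize("test test")` and `normalize("te st")` are left with their space, which is the compromise described above.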

Besides this, the PR also removes diacritics and uses replace instead of replaceAll to avoid using a regex. This PR has been tested on Paper 1.16.5.
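As an illustration of those two points, the sketch below shows one common way to strip diacritics with the standard `java.text.Normalizer`, plus a literal `String.replace` call. Whether the PR does it this way is an assumption; the class and method names are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical sketch, not the plugin's actual implementation.
final class DiacriticsSketch {

    // Decomposes accented characters (NFD) and drops the combining marks,
    // so "é" becomes "e". No regex is involved.
    static String stripDiacritics(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            if (Character.getType(c) != Character.NON_SPACING_MARK) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // String.replace treats its argument as literal text.
    // String.replaceAll would compile it as a regex, where "." matches any character.
    static String removeLiteral(String text, String literal) {
        return text.replace(literal, "");
    }
}
```

For example, `stripDiacritics("t\u00e9s t")` gives `tes t`, and `removeLiteral("t.e.s.t", ".")` gives `test`, whereas `"t.e.s.t".replaceAll(".", "")` would return an empty string.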

TheEpicBlock commented 3 years ago

Note: this currently doesn't detect t eeeee st, which can be fixed by swapping noRepeatChars and normalizedString. That is not done here for fear of merge conflicts.
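To show why the order matters, here is a small sketch that assumes "swapping" means collapsing repeated characters before joining single letters. Both helpers are hypothetical stand-ins, not the plugin's `noRepeatChars` or `normalizedString` code.

```java
// Hypothetical sketch, not the plugin's actual implementation.
final class RepeatOrderSketch {

    // Collapses runs of the same character: "eeeee" -> "e".
    static String collapseRepeats(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (i == 0 || c != s.charAt(i - 1)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Compact version of the single-letter rule: a space survives only
    // between two words that are both longer than one letter.
    static String joinSingleLetters(String s) {
        String[] words = s.trim().split("[^a-z]+");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0 && words[i - 1].length() > 1 && words[i].length() > 1) {
                out.append(' ');
            }
            out.append(words[i]);
        }
        return out.toString();
    }
}
```

Joining first and collapsing afterwards turns `t eeeee st` into `te st`, because `eeeee` is still a multi-letter word when the separators are examined. Collapsing first yields `t e st`, which then joins to `test`.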