This replaces the normalization algorithm with something similar to Pagi-Bot's current algorithm.
Instead of removing every character outside a-z, it only strips such characters when they sit next to a single letter. Examples:
t est -> test
tés t -> test
t e s t -> test
t.e.s t -> test
test test -> test test
This prevents matches from spanning word boundaries. The downside is that "te st" will not be detected, since neither fragment is a single letter. I believe this is a valid compromise, however. As mentioned above, a similar algorithm has been running in Pagi-Bot for a long time.
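The rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name SpacedLetterNormalizer and the methods normalize and stripDiacritics are hypothetical, and the sketch assumes that a separator between two multi-letter runs collapses to a single space.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; names are illustrative and not taken from the PR.
final class SpacedLetterNormalizer {

    // Decompose accented letters (NFD) and drop the combining marks, without a regex.
    static String stripDiacritics(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            if (Character.getType(c) != Character.NON_SPACING_MARK) {
                out.append(c);
            }
        }
        return out.toString();
    }

    static String normalize(String input) {
        String text = stripDiacritics(input).toLowerCase(Locale.ROOT);

        // Split the text into runs of a-z; everything else acts as a separator.
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 'a' && c <= 'z') {
                current.append(c);
            } else if (current.length() > 0) {
                runs.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            runs.add(current.toString());
        }

        // Drop a separator only when one of its neighbouring runs is a single letter;
        // otherwise keep the word boundary as one space (an assumption of this sketch).
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < runs.size(); i++) {
            if (i > 0) {
                boolean nextToSingleLetter =
                        runs.get(i - 1).length() == 1 || runs.get(i).length() == 1;
                if (!nextToSingleLetter) {
                    out.append(' ');
                }
            }
            out.append(runs.get(i));
        }
        return out.toString();
    }
}
```

Under these assumptions, normalize("t.e.s t") gives test, normalize("test test") keeps the space, and normalize("te st") stays te st, which is the trade-off noted above.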
Besides this, the PR also removes diacritics and uses replace instead of replaceAll to avoid using a regex.
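As a small illustration of why the distinction matters: String.replace treats its first argument as literal text, while String.replaceAll interprets it as a regular expression. The class and method names below are hypothetical and only serve the example.

```java
// Hypothetical demo class; not part of the PR.
final class ReplaceVsReplaceAll {

    // Literal replacement: only actual '.' characters are removed.
    static String dropDotsLiteral(String s) {
        return s.replace(".", "");
    }

    // Regex replacement: "." matches any character, so every character is removed.
    static String dropDotsRegex(String s) {
        return s.replaceAll(".", "");
    }
}
```

For the input t.e.s.t, dropDotsLiteral returns test, whereas dropDotsRegex returns an empty string because the unescaped dot matches every character.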
This PR has been tested on Paper 1.16.5.
Note: this currently doesn't detect "t eeeee st". That can be fixed by swapping noRepeatChars and normalizedString, which is not done here for fear of merge conflicts.