vzhou842 / profanity-check

A fast, robust Python library to check for offensive language in strings.
https://pypi.org/project/profanity-check
MIT License

Suggestion: Use a leet speak pre-filter on the input data before vectorizing #6

Open ArEnSc opened 5 years ago

ArEnSc commented 5 years ago

It has a hard time picking up on less common variants of swear words like "f4ck you" or "you b1tch" because they don't appear often enough in the training corpus. Never treat any prediction from this library as unquestionable truth, because it does and will make mistakes. Instead, use this library as a heuristic.

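Suggestion: run a leet speak pre-filter on the input before it is vectorized, so that variants like "b1tch" are normalized to the spelling that appears in the training corpus.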
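A minimal sketch of what such a pre-filter might look like. The `LEET_MAP` table and the `deleet` helper are hypothetical names and not part of profanity-check, and the choice of substitutions is an assumption. Tokens with no letters are left alone so that plain numbers are not rewritten.

```python
import re

# Hypothetical substitution table; which characters to include is an assumption.
LEET_MAP = str.maketrans({
    "0": "o",
    "1": "i",
    "3": "e",
    "4": "a",
    "5": "s",
    "7": "t",
    "@": "a",
    "$": "s",
})

_TOKEN = re.compile(r"\S+")


def deleet(text: str) -> str:
    """Replace leet characters, but only in tokens that contain a letter."""
    def fix(match):
        token = match.group(0)
        # Skip purely numeric or symbolic tokens such as "10" or "$5".
        if any(ch.isalpha() for ch in token):
            return token.translate(LEET_MAP)
        return token

    return _TOKEN.sub(fix, text)


print(deleet("you b1tch"))      # you bitch
print(deleet("call me at 10"))  # call me at 10
```

The normalized string could then be passed to `predict` or `predict_prob` in place of the raw input. A one-to-one table like this has limits: "f4ck" becomes "fack", since "4" conventionally stands for "a", so catching that variant would need a fuzzier match or more training data.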