snguyenthanh / better_profanity

Blazingly fast cleaning swear words (and their leetspeak) in strings
MIT License

Quoted profanities aren't censored #33

Open shopeonarope opened 3 years ago

shopeonarope commented 3 years ago

Here's an example of what I am seeing.

>>> from better_profanity import profanity
>>> profanity.contains_profanity('I have to go pee')
True
>>> profanity.contains_profanity('I have to go "pee"')
False

I looked around and it doesn't seem like this behavior is intentional.

BradKML commented 3 years ago

Tokenization and the Scunthorpe problem are the issue in this case.

shopeonarope commented 3 years ago

I notice that it is able to correctly identify profanity when the word is surrounded by left and right (curly) double quotes.

>>> profanity.contains_profanity('I have to “pee”')
True

Since it already handles that specific case, can it be made more general to handle other quoted strings?
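
In the meantime, one possible workaround is to blank out straight double quotes before running the check. This is only a minimal sketch: `strip_straight_quotes` and `contains_profanity_unquoted` are hypothetical helper names, not part of better_profanity, and the checker is passed in as a callable so the helper does not assume anything about the library's internals.

```python
from typing import Callable


def strip_straight_quotes(text: str) -> str:
    """Replace ASCII double quotes with spaces so a quoted word stands alone."""
    # Hypothetical helper: only the straight double quote is handled here,
    # since that is the character shown failing in the examples above.
    return text.replace('"', ' ')


def contains_profanity_unquoted(text: str, checker: Callable[[str], bool]) -> bool:
    """Run `checker` on `text` after blanking out straight double quotes."""
    # `checker` is any function taking a string and returning a bool,
    # for example `profanity.contains_profanity`.
    return checker(strip_straight_quotes(text))
```

With the library installed, this would be called as `contains_profanity_unquoted('I have to go "pee"', profanity.contains_profanity)`. Single quotes are deliberately left alone here, because replacing them would also split contractions such as "don't".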