vzhou842 / profanity-check

A fast, robust Python library to check for offensive language in strings.
https://pypi.org/project/profanity-check
MIT License
612 stars, 113 forks

How would one go about adding Spanish support? #5

Closed: zardilior closed this issue 5 years ago

vzhou842 commented 5 years ago

You'd need a labeled Spanish dataset to train on, which unfortunately I don't have.

zardilior commented 5 years ago

@vzhou842 We have found that no good Spanish profanity checkers are available, so we would be interested in creating that labeled Spanish dataset to train on. What would this dataset need to look like to be useful to you?

vzhou842 commented 5 years ago

You'd just need a large number of examples of both profane Spanish text and clean (not profane) Spanish text. Ideally you'd probably want 100k+ examples.
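For anyone assembling such a dataset, here is a rough sketch of a sanity check that it covers both classes and approaches the suggested size. The thread does not specify a file format, so the CSV layout, the column names `text` and `is_offensive`, and the helper `summarize_dataset` are all assumptions made for illustration, not something the library requires.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; the thread does not specify a file format.
TEXT_COLUMN = "text"
LABEL_COLUMN = "is_offensive"


def summarize_dataset(csv_file, min_examples=100_000):
    """Count clean (0) and profane (1) rows and check class coverage and size."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        text = (row.get(TEXT_COLUMN) or "").strip()
        label = (row.get(LABEL_COLUMN) or "").strip()
        if not text or label not in ("0", "1"):
            counts["skipped"] += 1  # empty text or unrecognized label
            continue
        counts["profane" if label == "1" else "clean"] += 1
    total = counts["profane"] + counts["clean"]
    return {
        "clean": counts["clean"],
        "profane": counts["profane"],
        "skipped": counts["skipped"],
        "has_both_classes": counts["clean"] > 0 and counts["profane"] > 0,
        # 100k+ is the target mentioned in the comment above.
        "meets_size_target": total >= min_examples,
    }


# Tiny illustrative sample with placeholder text, not real data.
sample = io.StringIO(
    "text,is_offensive\n"
    "buenos dias a todos,0\n"
    "<texto ofensivo de ejemplo>,1\n"
    ",1\n"
)
summary = summarize_dataset(sample)
print(summary)
```

On a real file you would pass an open file handle instead of the in-memory sample, e.g. `open(path, newline="", encoding="utf-8")`.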

zardilior commented 5 years ago

Hmm... If I make an open-source list, it might be possible.

fractefactos commented 3 years ago

... That would be great.