cc-archive / open-ledger

Prototype code and examples for work on the Creative Commons "CC Search" project
MIT License

Add basic profanity/offensive term filter to List data #135

Closed lizadaly closed 7 years ago

lizadaly commented 7 years ago

List descriptions and titles are user-generated content that can be made public. Ensure that basic precautions are taken to prevent obvious profanity or harmful terms from being included in these fields.

lizadaly commented 7 years ago

@robmyers Do you have a word blacklist already in use on the blog/wiki?

rheaplex commented 7 years ago

No word list; we're just using Akismet. That might be an option here too, since there's a Python library for the API.

lizadaly commented 7 years ago

Can you send me the Akismet auth creds out-of-band?

lizadaly commented 7 years ago

Implemented both Akismet (for spam) and https://github.com/dariusk/wordfilter (for word list filtering). Note that the word list filter may produce many false positives; we can tune it if it is too restrictive.
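
To illustrate the false-positive concern, here is a minimal sketch, assuming the filter matches blocked terms as substrings. That is an assumption made for illustration, not a description of the code that was merged. The term list and both function names are hypothetical placeholders, and the sketch uses only the standard library rather than the wordfilter package.

```python
import re

# Hypothetical placeholder terms; a real deployment would load a maintained list.
BLOCKED_TERMS = {"darn", "heck"}


def contains_blocked_substring(text, terms=BLOCKED_TERMS):
    """Return True if any blocked term appears anywhere in the text.

    Substring matching is simple and catches embedded terms, but it also
    flags innocent words that happen to contain a blocked term.
    """
    lowered = text.lower()
    return any(term in lowered for term in terms)


def contains_blocked_word(text, terms=BLOCKED_TERMS):
    """Return True only if a blocked term appears as a whole word.

    This is stricter about what counts as a match, so it yields fewer
    false positives, at the cost of missing terms embedded in longer words.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    return any(word in terms for word in words)


if __name__ == "__main__":
    title = "My checklist"  # "checklist" contains the substring "heck"
    print(contains_blocked_substring(title))
    print(contains_blocked_word(title))
```

With these placeholder terms, a List title such as "My checklist" is rejected by the substring check but accepted by the whole-word check. Tuning could therefore mean switching matching strategy, trimming the term list, or both.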