WhiteHouse / petitions

Drupal installation profile powering We The People at petitions.whitehouse.gov
https://petitions.whitehouse.gov/

is_profane module is naive #51

Closed jpmckinney closed 11 years ago

jpmckinney commented 11 years ago

If a petition includes the word "classic", the module will consider it profane, because "classic" includes "ass".

See "The Clbuttic Mistake: When obscenity filters go wrong" for common pitfalls around obscenity filters/detectors.

Here's the naive line of code.
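The linked line is not reproduced in this thread. As a rough illustration only, a substring check of the kind described behaves like the sketch below. The function name `contains_substring` and the term list are hypothetical and are not taken from the module.

```php
<?php
// Hypothetical helper, NOT the module's actual code: flags text when any
// listed term occurs anywhere inside it, even in the middle of a longer word.
function contains_substring($text, array $terms) {
  foreach ($terms as $term) {
    // stripos() returns the match offset (which may be 0) or false,
    // so the result has to be compared with !== false.
    if (stripos($text, $term) !== false) {
      return true;
    }
  }
  return false;
}

// "classic" contains "ass", so an innocent sentence gets flagged.
var_dump(contains_substring('A classic petition', array('ass'))); // bool(true)
```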

bryanhirsch commented 11 years ago

@jpmckinney, good point. Thanks for raising it and sharing this link. If you have a vision or code for a smarter is_profane filter, please share it. I'd be happy to take a look.

This module actually works as intended. is_profane is designed (1) to empower site admins to determine what types of things they want to flag and (2) to be extended by other modules that want to react to things being flagged as potentially profane. How administrators or other modules respond to content that gets flagged is up to them.

I'm going to mark this issue closed for now, since the "classic" issue is a feature, not a bug. But if you think the issue needs further consideration, please clarify and reopen.

jpmckinney commented 11 years ago

Instead of using strpos you should use preg_match, e.g.:

if (preg_match("/\b$string\b/", $term)) {

That way, you'd match "ass" but not "classic". I don't know any administrator who wants a lot of false positives to review (maybe an admin with a lot of time on their hands?).
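A slightly fuller sketch of the word-boundary idea, under some assumptions: the helper name `matches_whole_word` and the term list are hypothetical, and `preg_quote()` and the case-insensitive `i` flag are additions that go beyond the one-line suggestion above.

```php
<?php
// Hypothetical helper illustrating the word-boundary suggestion; the name
// and parameters are assumptions, not part of the is_profane module.
function matches_whole_word($text, array $terms) {
  foreach ($terms as $term) {
    // preg_quote() escapes regex metacharacters in the term, including
    // the "/" delimiter passed as its second argument.
    $pattern = '/\b' . preg_quote($term, '/') . '\b/i';
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $text) === 1) {
      return true;
    }
  }
  return false;
}

var_dump(matches_whole_word('A classic petition', array('ass'))); // bool(false)
var_dump(matches_whole_word('What an ass.', array('ass')));       // bool(true)
```

This still has limits: `\b` only matches between a word character and a non-word character, so inflected forms and deliberately obfuscated spellings are not caught, and non-ASCII text would likely need the `u` modifier.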

A more developer-friendly blog post is by Jeff Atwood at http://www.codinghorror.com/blog/2008/10/obscenity-filters-bad-idea-or-incredibly-intercoursing-bad-idea.html

I can't believe that false positives are considered a "feature".

jpmckinney commented 11 years ago

By the way, it's impossible for a non-admin to reopen issues, so you'd have to do it.