FriendsOfFlarum / filter

Flag potentially offensive posts
MIT License

Posting does not work if there are a lot of bad words added in the extension settings #28

Closed rafaucau closed 3 months ago

rafaucau commented 3 years ago

Bug Report

Current Behavior

If I add a very large number of words to filter, writing posts does not work at all.

preg_replace_callback(): Empty regular expression in vendor\fof\filter\src\Listener\CheckPost.php on line 85

Steps to Reproduce

  1. Add a very large number of words to the filter.
  2. Try writing a post that doesn't contain a single one of these words.

Screenshots

[image] [image]

Environment

dsevillamartin commented 2 years ago

Is this still an issue?

EDIT: When I added a lot of words, the generated RegExp got truncated in the DB and so didn't work, but it didn't cause an error.

rafaucau commented 2 years ago

@datitisev
> Is this still an issue?

https://user-images.githubusercontent.com/25438601/149270702-e585440f-5876-4aae-b10d-f2270783505c.mp4

I checked how it looks in the database, and it looks weird. [image]

dsevillamartin commented 2 years ago

I'm assuming that the array/regex is getting truncated because it's too big. The solution is probably to build it when a post is saved, but then that's wasting a lot of time creating a regex on every post save... 🤔 Not sure what the best solution is.
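To illustrate the truncation hypothesis above, here is a minimal sketch, written in Python rather than the extension's PHP. It assumes the word list is joined into one alternation and stored in a column capped at 65,535 bytes (MySQL's `TEXT` maximum; whether that is the column type actually used here is an assumption). The names `build_pattern`, `fits_in_column`, and `compiles` are hypothetical helpers, not part of the extension.

```python
import re

# Assumed storage limit: 65,535 bytes is the maximum size of a MySQL TEXT column.
MAX_STORED_BYTES = 65_535


def build_pattern(words):
    """Join escaped words into a single alternation bounded by word boundaries."""
    alternatives = "|".join(re.escape(w) for w in words if w)
    return rf"\b(?:{alternatives})\b"


def fits_in_column(pattern, limit=MAX_STORED_BYTES):
    """Return True if the UTF-8 encoded pattern fits within the assumed limit."""
    return len(pattern.encode("utf-8")) <= limit


def compiles(pattern):
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A synthetic word list large enough to exceed the assumed limit.
words = [f"word{i}" for i in range(10_000)]
pattern = build_pattern(words)

# Simulate what a silently truncating column would hand back.
truncated = pattern.encode("utf-8")[:MAX_STORED_BYTES].decode("utf-8", errors="ignore")
```

In this sketch the truncated pattern loses its closing parenthesis and no longer compiles, so checking `fits_in_column(pattern)` before saving would surface the problem at configuration time instead of at post time.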

rafaucau commented 2 years ago

@datitisev
> I'm assuming that array/regex is getting truncated because it's too big.

If you would like to test, here is the list of words I use: polish_bad_words.txt

> Solution is probs to build it after a post but then that's wasting a lot of time creating a regex on every post save... 🤔 not sure what the best solution is.

Maybe a separate table where each word would be on a separate row? In one column the word, in the second column the generated regular expressions.

Or a separate table where the column would be longtext.

dsevillamartin commented 2 years ago

> Maybe a separate table where each word would be on a separate row? In one column the word, in the second column the generated regular expressions.

I don't think this would solve much... this would be a large query on every post save, probably taking up more time than simply generating the regex on post save from the word list 🤔.

It would be a large query because you'd be retrieving hundreds of rows at once, which I assume is not a good idea.
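For reference, generating the regex from the word list at post-save time could look roughly like the sketch below. It is written in Python rather than the extension's PHP, and `compile_filter` and `should_flag` are hypothetical names, not the extension's API. The one-entry cache is an added assumption that limits the rebuild cost to the saves where the word list has actually changed. The empty-list guard avoids ever matching with an empty pattern.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=1)
def compile_filter(words):
    """Compile one alternation from a tuple of words.

    The result is cached, so the pattern is rebuilt only when the tuple changes.
    """
    if not words:
        return None  # nothing to filter; never match with an empty pattern
    alternatives = "|".join(re.escape(w) for w in words)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def should_flag(post_text, words):
    """Return True if the post contains any of the filtered words."""
    regex = compile_filter(tuple(words))
    return bool(regex and regex.search(post_text))
```

With this shape, `should_flag("Some BadWord here", ["badword"])` builds the pattern once and reuses it for later posts until the list is edited.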

DavideIadeluca commented 3 months ago

@rafaucau Should be resolved with https://github.com/FriendsOfFlarum/filter/pull/53.