RocketChat / Rocket.Chat

The communications platform that puts data protection first.
https://rocket.chat/

Bad-words filter only works for Roman characters #12430

Open frdmn opened 5 years ago

frdmn commented 5 years ago

Description:

The bad-words filter only seems to respect Roman characters.

I suspect this is because the library's default regex is used:

https://github.com/web-mech/badwords/blob/e15b99e41ebf794503148b80699914ebcf12f173/lib/badwords.js#L21

There is a mention of multi-lingual support with an example in the project README.md:

var Filter = require('bad-words');
// Multilingual support for word filtering
var filter = new Filter({ replaceRegex: /[A-Za-z0-9가-힣_]/g });
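
To illustrate why a replace pattern matters here, this is a minimal standalone sketch. It does not call the library. `maskWord` is a hypothetical helper, and the ASCII-only pattern `/\w/g` is an assumption standing in for whatever default sits at the linked line. In JavaScript, `\w` only matches `[A-Za-z0-9_]`, so Cyrillic letters pass through untouched, while a Unicode property class with the `u` flag covers them:

```javascript
// Hypothetical helper (not part of the bad-words library): replace every
// character of `word` that matches `replaceRegex` with `placeHolder`.
function maskWord(word, replaceRegex, placeHolder = '*') {
  return word.replace(replaceRegex, placeHolder);
}

// Assumed ASCII-only pattern: \w is [A-Za-z0-9_] in JavaScript.
const asciiOnly = /\w/g;
// Unicode-aware alternative: any letter or number in any script, plus "_".
const unicodeAware = /[\p{L}\p{N}_]/gu;

const maskedLatin = maskWord('test', asciiOnly);
const maskedCyrillicAscii = maskWord('Привет', asciiOnly);
const maskedCyrillicUnicode = maskWord('Привет', unicodeAware);

console.log(maskedLatin);           // ****
console.log(maskedCyrillicAscii);   // Привет (unchanged)
console.log(maskedCyrillicUnicode); // ******
```

If the library accepts a custom `replaceRegex` as the README example suggests, a pattern like `unicodeAware` could be passed there instead of listing scripts one by one. Whether Rocket.Chat exposes that option is not something this sketch verifies.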

Steps to reproduce:

  1. Go to admin UI → Message

  2. Set:

    • "Allow Message bad words filtering" to True
    • "Add Bad Words to the Blacklist" to test,example,Привет

  3. Save configuration

  4. Go to any channel and send a message with the content: test,example,Привет

Expected behavior:

All of the configured "bad words" should be censored:

frdmn: ***,*******,******

Actual behavior:

frdmn: ***,*******,Привет
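
One possible contributing factor, not confirmed in this thread: if the word detection relies on `\b` word boundaries, it would also miss Cyrillic words, because `\b` in JavaScript is defined in terms of the ASCII-only `\w`. The sketch below is standalone and does not use the library. `wholeWordRegex` is a hypothetical helper that compares a `\b`-based matcher with one built from Unicode-aware lookarounds:

```javascript
// Hypothetical helper: build a case-insensitive whole-word matcher.
// unicode = false uses \b (ASCII word boundaries only);
// unicode = true uses lookarounds over Unicode letters and numbers.
function wholeWordRegex(word, unicode) {
  // Escape regex metacharacters so the word is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return unicode
    ? new RegExp(`(?<![\\p{L}\\p{N}_])${escaped}(?![\\p{L}\\p{N}_])`, 'iu')
    : new RegExp(`\\b${escaped}\\b`, 'i');
}

const message = 'test,example,Привет';

console.log(wholeWordRegex('test', false).test(message));   // true
console.log(wholeWordRegex('Привет', false).test(message)); // false
console.log(wholeWordRegex('Привет', true).test(message));  // true
```

The lookaround form needs an engine with lookbehind and Unicode property escape support, which recent Node.js versions provide.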

Server Setup Information:

Additional context

[Screenshot: 2018-10-26 at 10:21:35]

engelgabriel commented 4 years ago

Now that we have the Content Filter app for that, we should remove the functionality from the main code altogether.

We must make sure the Content Filter app works with non-Roman characters.