monthly-basis / content-moderation

Determine whether text contains bad words.
https://www.monthlybasis.com

Improve regex's for immature words #6

Open leo-galleguillos opened 3 years ago

leo-galleguillos commented 3 years ago
Alireza-Sampour commented 3 years ago

Hi @leo-galleguillos, for example, for `crap`, do we just need to create a regex pattern that covers every spelling that can stand for `crap`, like `craaaap`, `crap`, `crapp`?

leo-galleguillos commented 3 years ago

> Hi @leo-galleguillos, for example, for `crap`, do we just need to create a regex pattern that covers every spelling that can stand for `crap`, like `craaaap`, `crap`, `crapp`?

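We just need to detect `crap` (case-insensitive).

However, we don't want to detect inoffensive words that contain `crap`, such as `scrape` or `scrapbook`.

Also, for this word, we can assume that there are no inoffensive words which start with `crap`.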

Btw, do you think I should still add the "hacktoberfest" label? I'm actually not quite sure what that is.

Alireza-Sampour commented 3 years ago

> > Hi @leo-galleguillos, for example, for `crap`, do we just need to create a regex pattern that covers every spelling that can stand for `crap`, like `craaaap`, `crap`, `crapp`?
>
> We just need to detect `crap` (case-insensitive).
>
> However, we don't want to detect inoffensive words that contain `crap`, such as `scrape` or `scrapbook`.
>
> Also, for this word, we can assume that there are no inoffensive words which start with `crap`.
>
> Btw, do you think I should still add the "hacktoberfest" label? I'm actually not quite sure what that is.

OK, assign this to me. After the pull request, just label it with hacktoberfest-accepted :)
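The matching rule discussed so far (detect crap case-insensitively, skip words such as scrape and scrapbook, and treat any word that starts with crap as a match) can be sketched with a single pattern. This is only an illustration in Python, not the project's PHP code; the names `CRAP_PATTERN` and `contains_crap` are hypothetical.

```python
import re

# Hypothetical pattern, not taken from the project's source.
# \b requires a word boundary before "crap", so "scrape" and "scrapbook"
# do not match. There is deliberately no trailing boundary: the thread
# assumes no inoffensive word starts with "crap", so any such word matches.
CRAP_PATTERN = re.compile(r"\bcrap", re.IGNORECASE)


def contains_crap(text: str) -> bool:
    """Return True if the text contains a word that starts with "crap"."""
    return CRAP_PATTERN.search(text) is not None
```

Under these assumptions, `contains_crap("What a CRAP day")` is `True` and `contains_crap("my scrapbook")` is `False`. Stretched spellings such as craaaap are not covered, in line with the reply that only crap itself needs to be detected.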

leo-galleguillos commented 3 years ago

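Wow, this looks great, thanks so much! A couple things:

* `crap` is considered an "immature" word in our software, not a bad word. Can you move the regular expression (and unit test) from the `BadWords` service to the `ImmatureWords` service?

* Can you also catch the following variation: `crappy`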

I probably should have clarified the above two points in the original spec (my apologies).

In any case, thanks again for the code you wrote. It truly looks great, and I'm impressed that you were able to figure out where and how to write the regular expression and where to write the unit test.

This is the first time we will be merging code from another developer into this project. I can't thank you enough, and hopefully soon other people will be able to use this software in real-time.

Alireza-Sampour commented 3 years ago

> Wow, this looks great, thanks so much! A couple things:
>
> * `crap` is considered an "immature" word in our software, not a bad word. Can you move the regular expression (and unit test) from the `BadWords` service to the `ImmatureWords` service?
>
> * Can you also catch the following variation: `crappy`
>
> I probably should have clarified the above two points in the original spec (my apologies).
>
> In any case, thanks again for the code you wrote. It truly looks great, and I'm impressed that you were able to figure out where and how to write the regular expression and where to write the unit test.
>
> This is the first time we will be merging code from another developer into this project. I can't thank you enough, and hopefully soon other people will be able to use this software in real-time.

Sorry, my bad! I didn't read all of your code; I thought there was only a bad-words file. I edited my code and submitted another pull request, and I hope this works for you. (Actually, I'm a newbie in PHP and I don't know how to run the unit tests to be sure about my code; I just followed your other unit tests to create mine.) If there is a mistake or something like that, feel free to mention me. I will be glad if my pull request is merged and you label it with hacktoberfest-accepted. I wish you luck with your project :)