areebbeigh / profanityfilter

A universal Python library for detecting and filtering profanity
https://pypi.python.org/pypi/profanityfilter
BSD 3-Clause "New" or "Revised" License

Improvements to remove_word + new method: remove_word_list #18

Open RicoViking9000 opened 5 years ago

RicoViking9000 commented 5 years ago

While using your (wonderful, by the way) profanity filter module, I ran into a limitation for my use case: the only way to remove a word from the extra_censor_list is to call restore_words() and then redefine the extra_censor_list without that word.

I have forked and modified the remove_word() method to better suit my needs and possibly the needs of other people. If you see this as something other people might make use of, you're welcome to merge.

In summary, I added an argument, anywhere, to remove_word() (defaults to True). When it is True, the method checks whether the word is in extra_censor_list before moving on to the other lists. Previously there was also no easy way to remove a word from the custom_censor_list without redefining it, so the method now does that by default when a custom_censor_list is in use.

I also added a new method, remove_word_list. This makes sense to me because define_words takes a list, so there is now a matching method that takes a list and does essentially the same thing as remove_word() for each word in it.
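For readers following along, here is a minimal sketch of the behaviour described above. It is not the code from the fork or from the library: WordFilterSketch, its constructor, and its private attribute names are hypothetical, and the handling of anywhere=False (skip extra_censor_list entirely) is an assumption based on the later discussion in this thread.

```python
class WordFilterSketch:
    """Hypothetical stand-in for illustration; not the library's ProfanityFilter."""

    def __init__(self, default_words, custom_censor_list=None, extra_censor_list=None):
        self._censor_list = list(default_words)
        # None means "no custom list in use"; an empty list is still a custom list.
        self._custom_censor_list = (
            list(custom_censor_list) if custom_censor_list is not None else None
        )
        self._extra_censor_list = list(extra_censor_list or [])

    def _base_list(self):
        # Assumption: a custom list, when given, replaces the default list.
        if self._custom_censor_list is not None:
            return self._custom_censor_list
        return self._censor_list

    def get_profane_words(self):
        return sorted(set(self._base_list()) | set(self._extra_censor_list))

    def remove_word(self, word, anywhere=True):
        removed = False
        # With anywhere=True, look in extra_censor_list before the other lists.
        if anywhere and word in self._extra_censor_list:
            self._extra_censor_list.remove(word)
            removed = True
        base = self._base_list()
        if word in base:
            base.remove(word)
            removed = True
        if not removed:
            raise ValueError("%r is not in any searched list" % word)

    def remove_word_list(self, words, anywhere=True):
        # Same behaviour as remove_word(), applied to each word in turn.
        for word in words:
            self.remove_word(word, anywhere=anywhere)


pf = WordFilterSketch(["darn", "heck"], extra_censor_list=["darn", "drat"])
pf.remove_word("darn")          # removed from both the default and extra lists
pf.remove_word_list(["drat"])   # present only in the extra list
print(pf.get_profane_words())   # ['heck']
```

Raising ValueError for a word that is in none of the searched lists mirrors what list.remove() does; whether the fork behaves this way is not stated here.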

Finally, I updated the docstring slightly to match my edits.

The most efficient, working version of the changes is in the latest commit (60b1ad5).

areebbeigh commented 5 years ago

Okay, now this needs tests.

If you're not familiar with writing tests, take a look at how https://github.com/areebbeigh/profanityfilter/blob/master/tests/test_profanity.py is written. That's the file you'll be working with. Also, take a look at nose on PyPI. :+1:

RicoViking9000 commented 5 years ago

I will get to that this weekend. I appreciate the information.

RicoViking9000 commented 5 years ago

I also have code locally (off github) for get_word_boundaries() and set_word_boundaries(bool), I can add them here if you prefer; I'm using these methods for my application, but you're welcome to keep it locked to instantiation only, totally up to you

areebbeigh commented 5 years ago

> I also have code locally (off github) for get_word_boundaries() and set_word_boundaries(bool), I can add them here if you prefer; I'm using these methods for my application, but you're welcome to keep it locked to instantiation only, totally up to you

This should have its own PR. If you create it, make sure the changes on this PR are not included. You'll want to git checkout -b <new_branch_name> and implement these changes there. Make sure the new branch is in sync with upstream, i.e. areebbeigh/profanityfilter's master branch.

RicoViking9000 commented 5 years ago

The anywhere argument defaults to True in the method signatures, but I will redo my testing. I appreciate your patience with me here.

areebbeigh commented 5 years ago

In that case, test that anywhere=True removes the word from both lists and that anywhere=False doesn't. :)
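A test along these lines could look roughly like the sketch below. FilterStandIn is a hypothetical stand-in so the example runs on its own; a real test would import the library's class instead. The attribute names and the exact meaning of anywhere=False (leave extra_censor_list untouched) are assumptions, not taken from the fork.

```python
import unittest


class FilterStandIn:
    """Hypothetical stand-in; a real test would use the library's class."""

    def __init__(self, censor_list, extra_censor_list):
        self.censor_list = list(censor_list)
        self.extra_censor_list = list(extra_censor_list)

    def remove_word(self, word, anywhere=True):
        # Assumed semantics: anywhere=True also searches extra_censor_list.
        if anywhere and word in self.extra_censor_list:
            self.extra_censor_list.remove(word)
        if word in self.censor_list:
            self.censor_list.remove(word)


class AnywhereFlagTest(unittest.TestCase):
    def setUp(self):
        # "foo" is deliberately present in both lists.
        self.pf = FilterStandIn(["foo", "bar"], ["foo", "baz"])

    def test_anywhere_true_removes_from_both_lists(self):
        self.pf.remove_word("foo", anywhere=True)
        self.assertNotIn("foo", self.pf.censor_list)
        self.assertNotIn("foo", self.pf.extra_censor_list)

    def test_anywhere_false_leaves_extra_list_untouched(self):
        self.pf.remove_word("foo", anywhere=False)
        self.assertNotIn("foo", self.pf.censor_list)
        self.assertIn("foo", self.pf.extra_censor_list)
```

Test runners that collect unittest.TestCase subclasses, including nose, should pick these up without extra configuration.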

RicoViking9000 commented 5 years ago

Should I remove the lines that cause ValueErrors because a certain word is not in the profanity list for remove_word()/remove_words(), and replace them with lines that pass the Travis CI testing?
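One alternative to deleting such lines is to assert that the error occurs. The sketch below uses a plain Python list rather than the library, on the assumption that the ValueError comes from list.remove() being called with a word that is not present; the test class name is hypothetical.

```python
import unittest


class RemoveMissingWordTest(unittest.TestCase):
    def test_missing_word_raises_value_error(self):
        censor_list = ["foo", "bar"]
        # list.remove() raises ValueError when the item is absent.
        with self.assertRaises(ValueError):
            censor_list.remove("baz")
        # The list is unchanged after the failed removal.
        self.assertEqual(censor_list, ["foo", "bar"])
```

Written this way, the ValueError becomes the expected outcome of the test instead of a failure in the CI run.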

RicoViking9000 commented 5 years ago

I will add code to the test file this weekend

RicoViking9000 commented 5 years ago

This code has passed the tests

DonaldTsang commented 4 years ago

2020 review: This should be added back into the main repo, since the current master has not been updated since December 2018.