Closed kjoshi closed 5 years ago
Hi @bdewilde,
Thanks for the comments and suggestions. I've rebased this pull request onto the develop
branch and have made the other changes you suggested.
Let me know if there's anything else you'd like me to tweak.
Description
doc_extensions.to_bag_of_words has been modified to include additional flags that let the user decide whether to include or exclude stop words, punctuation, and/or spaces from the word counts. Corpus.word_counts and Corpus.word_doc_counts have also been updated to pass the relevant flags through to doc_extensions.to_bag_of_words.
Motivation and Context
Sometimes it may be of interest to keep track of stop words, punctuation and/or spaces when converting a doc to a bag of words.
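To make the intended behaviour concrete, here's a minimal sketch in plain Python. It is not the code from this pull request: the Token class is a hypothetical stand-in for a spaCy token (it only mirrors the is_stop, is_punct and is_space attributes), and the flag names filter_stops, filter_punct and filter_spaces are assumptions chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a spaCy token; only the attributes used below."""
    text: str
    is_stop: bool = False
    is_punct: bool = False
    is_space: bool = False


def to_bag_of_words(tokens, filter_stops=True, filter_punct=True, filter_spaces=True):
    """Count lowercased token texts, optionally dropping stop words,
    punctuation and whitespace tokens. Flag names are illustrative only."""
    kept = (
        tok.text.lower()
        for tok in tokens
        if not (filter_stops and tok.is_stop)
        and not (filter_punct and tok.is_punct)
        and not (filter_spaces and tok.is_space)
    )
    return Counter(kept)


doc = [
    Token("The", is_stop=True),
    Token("cat"),
    Token(" ", is_space=True),
    Token("sat"),
    Token("on", is_stop=True),
    Token("the", is_stop=True),
    Token("mat"),
    Token(".", is_punct=True),
]

# Default: stop words, punctuation and spaces are all excluded.
content_only = to_bag_of_words(doc)

# Keep stop words, but still drop punctuation and spaces.
with_stops = to_bag_of_words(doc, filter_stops=False)

# Keep everything.
everything = to_bag_of_words(
    doc, filter_stops=False, filter_punct=False, filter_spaces=False
)
```

With all three flags left on, only "cat", "sat" and "mat" are counted. Turning off filter_stops adds "the" (twice) and "on" to the counts, which is the kind of use case described above.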
How Has This Been Tested?
Types of changes
Checklist: