mmcs-ruby / sentiment

MIT License
0 stars 8 forks

TextPreprocessor #11

Closed denis46-g closed 2 years ago

denis46-g commented 2 years ago

Goskov Denis 3.7

Module Text_preprocessor, which has two functions:
1) The first one (words_in_corpus_frequency) takes a corpus, a list of tokenized texts (a list of lists of words), and returns a dictionary in which each word is matched with its frequency in the corpus.
2) The second one (delete_words_with_high_and_low_frequency) deletes words with very high and very low frequency from the input corpus and returns the changed corpus.
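A minimal Ruby sketch of how the two functions could look, under stated assumptions. It is not the code from the repository. The module name `TextPreprocessor` follows the issue title, and the `min_count` and `max_count` keyword arguments are hypothetical, since the issue does not say what counts as "very high" or "very low" frequency.

```ruby
# Hypothetical sketch of the module described in this issue (not the repository's code).
module TextPreprocessor
  module_function

  # Counts how many times each word occurs across the whole corpus.
  # corpus: Array of tokenized texts (Array of Arrays of Strings).
  def words_in_corpus_frequency(corpus)
    corpus.flatten.each_with_object(Hash.new(0)) do |word, counts|
      counts[word] += 1
    end
  end

  # Keeps only words whose corpus count lies within [min_count, max_count].
  # Both thresholds are assumptions made for this example.
  def delete_words_with_high_and_low_frequency(corpus, min_count: 2, max_count: 3)
    freq = words_in_corpus_frequency(corpus)
    corpus.map do |text|
      text.select { |word| freq[word].between?(min_count, max_count) }
    end
  end
end

corpus = [%w[the cat sat], %w[the dog sat], %w[the bird flew], %w[the cat ran]]
freq = TextPreprocessor.words_in_corpus_frequency(corpus)
filtered = TextPreprocessor.delete_words_with_high_and_low_frequency(corpus)
```

Here "the" occurs in every text and the words that occur only once are rare, so both groups are dropped under the assumed thresholds. Absolute counts are used for simplicity; relative frequencies (count divided by corpus size) would be an equally reasonable choice.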