Goskov Denis 3.7
Module Text_preprocessor, which has two functions:
1) the first one (words_in_corpus_frequency) takes a corpus, i.e. a list of tokenized texts (a list of lists of words), and returns a dictionary that maps each word to its frequency in the corpus;
2) the second one (delete_words_with_high_and_low_frequency) removes words with very high or very low frequency from the input corpus and returns the modified corpus.
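
The two functions described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the module's actual code: "frequency" is taken to mean the raw number of occurrences across all texts, and the cut-off parameters min_count and max_count are hypothetical names, since the description does not say how "very high" and "very low" are defined.

```python
from collections import Counter
from typing import Dict, List

Corpus = List[List[str]]  # a list of tokenized texts


def words_in_corpus_frequency(corpus: Corpus) -> Dict[str, int]:
    """Map each word to its number of occurrences in the whole corpus.

    Assumption: frequency is a raw count, not a relative share.
    """
    counts = Counter()
    for text in corpus:
        counts.update(text)
    return dict(counts)


def delete_words_with_high_and_low_frequency(
    corpus: Corpus,
    min_count: int = 2,    # hypothetical lower threshold
    max_count: int = 100,  # hypothetical upper threshold
) -> Corpus:
    """Return a new corpus without words that are too rare or too common."""
    frequency = words_in_corpus_frequency(corpus)
    return [
        [word for word in text if min_count <= frequency[word] <= max_count]
        for text in corpus
    ]


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "bird"]]
    print(words_in_corpus_frequency(corpus))
    print(delete_words_with_high_and_low_frequency(corpus, min_count=2, max_count=2))
```

This sketch builds a new corpus instead of modifying the input in place, and it keeps texts that become empty after filtering, so the number of texts stays the same. Whether the original module behaves this way is not stated in the description.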