Open gordonda opened 8 years ago
@gordonda If you look at the result of running example 1.4 as implemented above:
all_words = nltk.FreqDist(w.lower() for w in movie_reviews.words())
word_features = list(all_words)[:2000]
all_words already contains the frequencies sorted in descending order, because FreqDist arranges them in that order by default.
So please run the code again and close this issue if you are satisfied.
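There seems to be an error in the implementation of example 1.4 in chapter 6. The explanation in the text states that the 2000 most frequent words are to be extracted. The code given for this is:
all_words = nltk.FreqDist(w.lower() for w in movie_reviews.words())
word_features = list(all_words)[:2000]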
but this returns whichever words come first when iterating over the FreqDist, which are not necessarily the most frequent. One solution may be to replace the second line of code above with the following line:
word_features = [w for w,freq in all_words.most_common(2000)]
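
To make the difference concrete, here is a minimal sketch on a toy token list. It is not the book's code: `tokens`, `first_n_keys`, and `top_n_words` are hypothetical names introduced for illustration, and `collections.Counter` stands in for `nltk.FreqDist` on the assumption that NLTK 3 is in use, where `FreqDist` subclasses `Counter`. It also assumes Python 3.7+, where dictionaries keep insertion order.

```python
from collections import Counter

# Toy token stream standing in for movie_reviews.words() (hypothetical data).
tokens = ["plot", "the", "a", "the", "film", "the", "a", "the"]

# Counter is a stand-in for nltk.FreqDist (assumption: NLTK 3, where
# FreqDist subclasses Counter and so iterates in insertion order).
all_words = Counter(w.lower() for w in tokens)


def first_n_keys(freq, n):
    """Slice the keys in iteration order, as list(all_words)[:n] does."""
    return list(freq)[:n]


def top_n_words(freq, n):
    """Return the n words with the highest counts, via most_common."""
    return [w for w, _ in freq.most_common(n)]


first_two = first_n_keys(all_words, 2)
top_two = top_n_words(all_words, 2)

print(first_two)  # keys in first-seen order
print(top_two)    # keys ranked by count
```

In this sketch "plot" occurs once but is seen first, so slicing the keys picks it up, while `most_common` ranks "the" and "a" ahead of it. Whether the original example behaves this way depends on the installed NLTK version, so it is worth checking against your own setup.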