I'm not an expert in NLTK, but I tried following the algorithm and I don't understand how it can work.
It seems _build_frequency_dist is supposed to count the frequency of phrases. However, the phrase_list it receives is the one generated by _generate_phrases, which returns a set(), so every phrase can appear in it only once.
The resulting Counter object therefore counts every phrase as appearing exactly once.
This doesn't make sense, no?
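To show what I mean, here is a minimal sketch in plain Python. It is not the library's code: raw_phrases is a made-up example input, and I'm assuming the phrases are handed to Counter directly, which may not be exactly what _build_frequency_dist does internally.

```python
from collections import Counter

# Hypothetical phrases extracted from some text; one phrase occurs three times.
raw_phrases = [
    ("red", "apple"),
    ("green", "tea"),
    ("red", "apple"),
    ("red", "apple"),
]

# Deduplicating first, as a set() return value would, drops the repeat information.
phrase_set = set(raw_phrases)

counts_from_set = Counter(phrase_set)    # counting after deduplication
counts_from_list = Counter(raw_phrases)  # counting the original occurrences

print(counts_from_set)
print(counts_from_list)
```

Counting over the set gives 1 for every phrase, while counting over the list keeps the real number of occurrences. That difference is what I'm asking about.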