danpovey / pocolm

Small language toolkit for creation, interpolation and pruning of ARPA language models

Insert Unigram weights option to word_counts_to_vocab.py #97

Closed saikiranvalluri closed 5 years ago

saikiranvalluri commented 5 years ago

This option is useful when training pocolm on multiple Gigaword text corpora: it weights each corpus's unigram counts according to that corpus's relative importance with respect to the dev text.
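To illustrate the idea, here is a minimal sketch of weighted unigram counting across several corpora. It is not the code from this change and does not reflect the actual interface of `word_counts_to_vocab.py`. The function names, corpus names, counts and weights are all hypothetical, and the weights are assumed to have been estimated elsewhere (for example, from how well each corpus matches the dev text).

```python
from collections import Counter


def combine_weighted_counts(corpus_counts, weights):
    """Scale each corpus's unigram counts by its weight and sum them.

    Hypothetical helper for illustration only; not part of pocolm.
    """
    combined = Counter()
    for name, counts in corpus_counts.items():
        weight = weights.get(name, 1.0)  # corpora without a weight count fully
        for word, count in counts.items():
            combined[word] += weight * count
    return combined


def top_vocab(combined, num_words):
    """Return the num_words words with the highest weighted counts."""
    # Sort by descending weighted count, then alphabetically to break ties.
    ranked = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:num_words]]


# Made-up counts and weights: the "web" corpus is large but assumed to be
# a poor match for the dev text, so it is down-weighted.
corpus_counts = {
    "news": Counter({"the": 100, "market": 40, "senate": 10}),
    "web": Counter({"the": 300, "lol": 90, "market": 5}),
}
weights = {"news": 1.0, "web": 0.1}

combined = combine_weighted_counts(corpus_counts, weights)
vocab = top_vocab(combined, 3)
print(vocab)
```

Without weights, "lol" (90 occurrences) would outrank "senate" (10) in the vocabulary. With the out-of-domain corpus down-weighted, the ordering follows the corpus that better matches the dev text.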