kpu / kenlm

KenLM: Faster and Smaller Language Model Queries
http://kheafield.com/code/kenlm/

Add access to vocabulary in python bindings #30

Open cypreess opened 9 years ago

cypreess commented 9 years ago

It would be nice to have access to kenlm.LanguageModel.vocab, or even (perhaps the more Pythonic way) to support the iterable protocol on kenlm.LanguageModel.

kpu commented 9 years ago

Would a callback from LoadVirtual be sufficient?

kpu commented 9 years ago

The C++ side doesn't even remember the vocabulary strings by default because users either don't need it or have their own data structure populated by the EnumerateVocab callback API.

cypreess commented 9 years ago

I must say I did not read very deeply into the implementation. I was just wondering whether it would be easy to implement access to the vocabulary somehow.

manishbansal-fk commented 6 years ago

@kpu Is there any way to access the LanguageModel vocabulary from the Python wrapper? I am loading the model in Python as kenlm.Model("model.klm"), where "model.klm" was built from the command line.
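
For readers landing on this thread: one possible workaround, which nobody in the thread proposed or confirmed, is to read the vocabulary from the ARPA text file rather than from the loaded model. This only helps if the ARPA file that the binary was built from is still available. The sketch below relies on the ARPA format listing every word in its \1-grams: section, one entry per line as log probability, word, and an optional backoff weight. The function name read_arpa_vocab and the sample data are made up for illustration and are not part of the kenlm module.

```python
def read_arpa_vocab(lines):
    """Yield each word listed in the unigram section of an ARPA file.

    `lines` is any iterable of text lines, e.g. an open file object.
    This is a hypothetical helper, not part of the kenlm Python module.
    """
    in_unigrams = False
    for raw in lines:
        line = raw.strip()
        if line == "\\1-grams:":
            # Start of the unigram section.
            in_unigrams = True
            continue
        if not in_unigrams or not line:
            continue
        if line.startswith("\\"):
            # Next section header (e.g. \2-grams: or \end\): stop reading.
            break
        # Entry layout: log10 probability, word, optional backoff weight.
        fields = line.split()
        yield fields[1]


# Tiny made-up ARPA file, used only to exercise the function.
SAMPLE_ARPA = "\n".join([
    "\\data\\",
    "ngram 1=4",
    "ngram 2=1",
    "",
    "\\1-grams:",
    "-1.0\t<unk>",
    "-0.5\t<s>\t-0.3",
    "-0.7\t</s>",
    "-0.9\thello\t-0.2",
    "",
    "\\2-grams:",
    "-0.1\t<s> hello",
    "",
    "\\end\\",
])

vocab = list(read_arpa_vocab(SAMPLE_ARPA.splitlines()))
```

For a real model, the same function could be given an open file handle, for example set(read_arpa_vocab(f)) inside a with open(...) block, where the path to the ARPA file is whatever was passed to the command-line build. It does not recover the vocabulary from the .klm binary itself.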