kpu / kenlm

KenLM: Faster and Smaller Language Model Queries
http://kheafield.com/code/kenlm/

How to call Model.score inside my custom CPP binding module #354

Open kevinghst opened 3 years ago

kevinghst commented 3 years ago

Hi,

I have two questions.

1) I would like to call model.Score() inside my own PYBIND11_MODULE, like this:

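#include <torch/extension.h>  // torch::Tensor, PYBIND11_MODULE, TORCH_EXTENSION_NAME
#include "/home/cirrascale/kevin/kenlm/lm/model.hh"
#include <iostream>
#include <string>

using namespace std;
using namespace torch::indexing;

int test_function(torch::Tensor xs){

    // Load the binary language model (note: it is reloaded on every call here).
    lm::ngram::Model model("/home/cirrascale/kevin/10gram_ukau_sep11.bin");
    // Start from the begin-of-sentence context; out_state receives the context after scoring.
    lm::ngram::State state(model.BeginSentenceState()), out_state;
    const lm::ngram::Vocabulary &vocab = model.GetVocabulary();

    // Score() expects a WordIndex, so map the string to its vocabulary id first.
    string word = "can";
    auto cond_prob = model.Score(state, vocab.Index(word), out_state);

    cout << cond_prob << endl;

    return 1;
}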

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("test_function", &test_function, "Test function");
}

What should the setup.py look like for my custom module? I have already run python setup.py install for KenLM. This will be for my own separate module.

2) Will this also work for calling Model.BaseScore? I tried to compile the code above, but it gave me: error: no matching function for call to ‘lm::ngram::ProbingModel::BaseScore(lm::ngram::State&, const char [4], lm::ngram::State&)’

kpu commented 3 years ago

If you want to call BaseScore, it takes void pointers to the state objects rather than references, so put & in front of state and out_state.

Why did you pass a const char [4] when vocab.Index returns WordIndex?
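
To make the two points above concrete, here is a minimal, self-contained sketch. ToyModel, ToyVocabulary, State, and score_word are hypothetical stand-ins invented for illustration, not KenLM's classes, and the scores are made up. Only the shape of the call is meant to mirror the compiler error: BaseScore wants a pointer to the input state, a WordIndex (not a string literal), and a pointer to the output state.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; these are NOT KenLM's types.
using WordIndex = std::uint32_t;

struct State {
    WordIndex last_word = 0;  // toy context: only the previous word id
    unsigned length = 0;      // number of words consumed so far
};

class ToyVocabulary {
public:
    // Map a word to its id; 0 plays the role of the unknown word.
    WordIndex Index(const std::string &word) const {
        auto it = ids_.find(word);
        return it == ids_.end() ? 0 : it->second;
    }
private:
    std::unordered_map<std::string, WordIndex> ids_{{"can", 1}, {"you", 2}};
};

class ToyModel {
public:
    const ToyVocabulary &GetVocabulary() const { return vocab_; }
    State BeginSentenceState() const { return State{}; }

    // Typed entry point: states are passed by reference.
    float Score(const State &in, WordIndex word, State &out) const {
        out.last_word = word;
        out.length = in.length + 1;
        return word == 0 ? -5.0f : -1.0f;  // made-up log probabilities
    }

    // Type-erased entry point: states are passed as void pointers,
    // so the caller must write &state and &out_state.
    float BaseScore(const void *in, WordIndex word, void *out) const {
        assert(in != nullptr && out != nullptr);
        return Score(*static_cast<const State *>(in), word,
                     *static_cast<State *>(out));
    }
private:
    ToyVocabulary vocab_;
};

// Score a single word from the begin-of-sentence state via BaseScore.
float score_word(const ToyModel &model, const std::string &word) {
    State state = model.BeginSentenceState();
    State out_state;
    // Convert the string to a WordIndex first, then pass state addresses.
    return model.BaseScore(&state, model.GetVocabulary().Index(word), &out_state);
}
```

Applied to the code in the question, the equivalent call would presumably be model.BaseScore(&state, vocab.Index(word), &out_state), with the word looked up through the vocabulary instead of being passed as "can".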