agatan / ctclib

A collection of utilities related to CTC
MIT License

Adding features to KenLM binding or changing API #8

Open Uinelj opened 2 years ago

Uinelj commented 2 years ago

Hello,

While I know that the ctclib-kenlm-sys crate is not a general-purpose binding library, I currently need such a binding, and your work is a good starting point.

My first question is about KenLM's new method, which requires a dict. I have used KenLM's query binary and did not need a dict to run queries. So I forked your repository and made some changes (making Model public, using base_score, and then computing perplexity), and on the (tiny) samples that I have tried, I got the same results as the query binary.
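For reference, here is a minimal sketch of the perplexity step, assuming the model hands back one log10 probability per scored token (KenLM reports scores in log base 10). The function name `perplexity_from_log10` and the use of `f64` are hypothetical choices for illustration; this is not the API of ctclib-kenlm-sys or of the fork, and it leaves out how the per-token scores are obtained.

```rust
/// Sketch only: computes perplexity from per-token log10 probabilities.
/// `perplexity_from_log10` is a hypothetical helper, not part of any crate.
///
/// Perplexity is 10 raised to the negative mean log10 probability,
/// i.e. 10^(-(sum of scores) / (number of scored tokens)).
/// Returns `None` for an empty slice, where the mean is undefined.
pub fn perplexity_from_log10(token_scores: &[f64]) -> Option<f64> {
    if token_scores.is_empty() {
        return None;
    }
    // Sum of log10 probabilities over all scored tokens.
    let total: f64 = token_scores.iter().sum();
    // Mean log10 probability per token.
    let mean = total / token_scores.len() as f64;
    Some(10f64.powf(-mean))
}
```

To compare against the query binary, the slice should cover the same tokens the binary counts; as far as I can tell that includes the end-of-sentence token, but that is worth checking against its output on a small sample.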

So I'm wondering whether I should try to add these changes (in a non-breaking way, obviously) inside this lib, or whether you'd prefer me to fork ctclib-kenlm-sys and make something more opinionated.

Let me know what you prefer, or if there's a way of using the KenLM struct in the way I intend to do, without changing code :)

Uinelj commented 1 year ago

I have pushed my changes to my fork here, and I'll publish it on crates.io so that I can use it in my projects. If you are interested in me opening a PR with my changes against your repo, let me know!