agatan / ctclib

A collection of utilities related to CTC
MIT License

ctclib


NOTE: This is currently under development.

A collection of utilities related to CTC, with the goal of being fast and highly flexible.

Features

Installation

ctclib depends on kpu/kenlm, so you must first install the libraries that KenLM itself depends on (Boost and Eigen3).

For example, on Ubuntu (or another Debian-based Linux distribution), you can install them by running the following command:

apt install libboost-all-dev libeigen3-dev

Use ctclib from Rust

Currently, ctclib isn't available on crates.io, but you can use it as a git dependency.

[dependencies]
ctclib = { version = "*", git = "https://github.com/agatan/ctclib" }

Use ctclib from Python

ctclib provides Python bindings, named pyctclib. Currently, pyctclib isn't available on PyPI, but you can install it directly from the git repository. Make sure that cargo and libclang-dev are installed first.

pip install 'git+https://github.com/agatan/ctclib.git#egg=pyctclib&subdirectory=bindings/python'

Example

import pyctclib

decoder = pyctclib.BeamSearchDecoderWithKenLM(
    pyctclib.BeamSearchDecoderOptions(
        beam_size=100,
        beam_size_token=1000,
        beam_threshold=1,
        lm_weight=0.5,
    ),
    "/path/to/model.arpa",
    ["a", "b", "c", "_"],
)
# log_probs holds the CTC log-probabilities produced by your model.
decoder.decode(log_probs)

# Alternatively, you can plug in a user-defined LM.
# See pyctclib.LMProtocol.
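
To make the decoder's input and output more concrete, here is a minimal sketch of greedy CTC decoding in plain Python. It is not part of ctclib or pyctclib and does not use their API. The function name `ctc_greedy_decode` is hypothetical, and treating "_" as the blank token is an assumption based on the vocabulary in the example above. A beam search decoder with a language model explores many candidate sequences, while this sketch keeps only the single best token per time step.

```python
import math


def ctc_greedy_decode(log_probs, vocab, blank="_"):
    """Greedy CTC decoding (illustrative sketch, not the ctclib API).

    log_probs: list of time steps, each a list of per-token log-probabilities.
    vocab: list of tokens; `blank` is assumed to be the CTC blank symbol.
    """
    blank_id = vocab.index(blank)
    tokens = []
    prev = None
    for step in log_probs:
        # Pick the most likely token at this time step.
        best = max(range(len(step)), key=step.__getitem__)
        # Merge consecutive repeats and drop blanks.
        if best != prev and best != blank_id:
            tokens.append(vocab[best])
        prev = best
    return "".join(tokens)


vocab = ["a", "b", "c", "_"]
probs = [
    [0.7, 0.1, 0.1, 0.1],  # a
    [0.6, 0.2, 0.1, 0.1],  # a again (merged with the previous step)
    [0.1, 0.1, 0.1, 0.7],  # blank (separates the two runs of "a")
    [0.6, 0.2, 0.1, 0.1],  # a (a new token, because a blank came before it)
    [0.1, 0.7, 0.1, 0.1],  # b
]
log_probs = [[math.log(p) for p in row] for row in probs]
print(ctc_greedy_decode(log_probs, vocab))  # aab
```

The blank between the second and third "a" steps is what lets CTC emit the same token twice in a row. Without it, the repeats would collapse into a single "a".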