kensho-technologies / pyctcdecode

A fast and lightweight python-based CTC beam search decoder for speech recognition.
Apache License 2.0

Error with tutorial examples #58

Closed pkadambi closed 2 years ago

pkadambi commented 2 years ago

Hi, I'm using the 4-gram.arpa from kenlm. When I run

import kenlm

from pyctcdecode import build_ctcdecoder
kenlm_model = kenlm.Model("./4-gram.arpa")
labels = [
    " ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l",
    "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
decoder = build_ctcdecoder(
    labels,
    kenlm_model,
    alpha=0.5,  # tuned on a val set
    beta=1.0,  # tuned on a val set
)

I get the following error from build_ctcdecoder:

TypeError: Cannot convert <class 'kenlm.Model'> to string

How do I resolve this?

gkucsko commented 2 years ago

I believe build_ctcdecoder takes the path to the model file, not a loaded kenlm.Model object.
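A minimal sketch of that change, assuming build_ctcdecoder accepts the path through a kenlm_model_path keyword (not confirmed in this thread). The helper name build_decoder_from_path is hypothetical and only there to reject a loaded model early:

```python
import os

# Same 27 labels as in the question: space followed by a-z.
LABELS = [" "] + [chr(c) for c in range(ord("a"), ord("z") + 1)]


def build_decoder_from_path(lm_path, alpha=0.5, beta=1.0):
    """Build a decoder from a path to an ARPA/binary LM file (hypothetical helper)."""
    # Reject anything that is not a path, e.g. an already loaded kenlm.Model.
    if not isinstance(lm_path, (str, os.PathLike)):
        raise TypeError(
            f"expected a path to the language model file, got {type(lm_path).__name__}"
        )
    # Imported here so the argument check above works without pyctcdecode installed.
    from pyctcdecode import build_ctcdecoder

    return build_ctcdecoder(
        LABELS,
        kenlm_model_path=os.fspath(lm_path),  # assumed keyword name
        alpha=alpha,  # tuned on a val set
        beta=beta,  # tuned on a val set
    )


# Usage, assuming the file from the question exists:
# decoder = build_decoder_from_path("./4-gram.arpa")
```

With this, there is no need to call kenlm.Model(...) yourself; the path string goes straight to build_ctcdecoder.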

DanBmh commented 2 years ago

Had the same problem, fixed it with #62

lopez86 commented 2 years ago

The PR is merged now, so I'm going to close this. Thanks!