kensho-technologies / pyctcdecode

A fast and lightweight python-based CTC beam search decoder for speech recognition.
Apache License 2.0

Add dimension check during decode #45

Closed gkucsko closed 2 years ago

gkucsko commented 2 years ago

Currently we don't check the logit dimensions during predict. Most commonly, this leads to confusion between predict and batch_predict. For example, the code below throws a not very informative error rather than telling the user that the input has the wrong dimensions:

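import numpy as np
from pyctcdecode.decoder import build_ctcdecoder

decoder = build_ctcdecoder(labels=["", "a", "b", "c", " "])

# 2D input of shape (time, vocabulary): decodes as expected
m = np.eye(5)
assert decoder.decode(m) == "abc"

# 3D (batched) input of shape (1, time, vocabulary): fails with an uninformative error
m = np.array([np.eye(5)])
assert m.shape == (1, 5, 5)
decoder.decode(m)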
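
A minimal sketch of what such a check could look like, assuming the decoder expects a single 2D array of shape (time, vocabulary). The helper name `check_logits_dimension` and its `vocab_size` parameter are hypothetical and only illustrate the idea; they are not taken from pyctcdecode, and the check that was eventually merged may differ.

```python
import numpy as np


def check_logits_dimension(logits: np.ndarray, vocab_size: int) -> None:
    """Hypothetical helper: raise a descriptive error for badly shaped logits."""
    # A single utterance is assumed to be a 2D array of shape (time, vocabulary).
    if logits.ndim != 2:
        raise ValueError(
            f"Expected 2D logits of shape (time, vocabulary), got {logits.ndim}D "
            f"input with shape {logits.shape}. Batched input has to be decoded "
            "one item at a time or through a batch method."
        )
    # The last axis has to match the number of labels the decoder was built with.
    if logits.shape[1] != vocab_size:
        raise ValueError(
            f"Logits have vocabulary size {logits.shape[1]}, "
            f"but the decoder was built with {vocab_size} labels."
        )


# 2D input passes silently.
check_logits_dimension(np.eye(5), vocab_size=5)

# 3D input is rejected with a message that names the actual problem.
try:
    check_logits_dimension(np.array([np.eye(5)]), vocab_size=5)
except ValueError as err:
    message = str(err)
    print(message)
```

Running a check like this at the top of decode would let the failing call above surface the shape mismatch directly, instead of an error from deeper inside the beam search.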

gkucsko commented 2 years ago

closing, will be added in https://github.com/kensho-technologies/pyctcdecode/pull/52.