kensho-technologies / pyctcdecode

A fast and lightweight python-based CTC beam search decoder for speech recognition.
Apache License 2.0
Shape of logits #114

Closed pranav-chandrode closed 8 months ago

pranav-chandrode commented 10 months ago

What should the shape of the logits be? The shape of my output is [64,29] (time, num_classes). My decoder is beam_decoder = build_ctcdecoder(labels=labels, kenlm_model_path=None). But when I call text = beam_decoder.decode(out), I get this error: TypeError: max() received an invalid combination of arguments - got (axis=int, keepdims=bool, out=NoneType, ), but expected one of:

Could someone help me with this? Thanks!

eyenpi commented 9 months ago

Make sure that out is a NumPy array. I solved this issue by passing out.numpy() instead, as my output was a torch tensor.
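
A minimal sketch of that conversion, for anyone hitting the same error. The helper name to_numpy_logits is hypothetical and not part of pyctcdecode. The detach() and cpu() calls are an addition to the out.numpy() fix above: they cover tensors that require grad or live on a GPU, where a bare numpy() call would fail. The 2-D check assumes the decoder wants a single (time, num_classes) array, as in the question. The error itself most likely appears because a NumPy reduction on a non-array object forwards to that object's own max() method with NumPy-style keyword arguments, which torch rejects.

```python
import numpy as np


def to_numpy_logits(out):
    """Return `out` as a 2-D NumPy array of shape (time, num_classes).

    Hypothetical helper for illustration; not part of pyctcdecode.
    """
    # Torch tensors expose detach(); go through CPU memory before converting.
    if hasattr(out, "detach"):
        out = out.detach().cpu().numpy()

    logits = np.asarray(out)

    # Assumption: the decoder takes one utterance at a time, so a batched
    # (batch, time, num_classes) array has to be split by the caller first.
    if logits.ndim != 2:
        raise ValueError(
            f"expected logits of shape (time, num_classes), got {logits.shape}"
        )
    return logits


# Usage with the decoder from the question (requires pyctcdecode and torch):
#   beam_decoder = build_ctcdecoder(labels=labels, kenlm_model_path=None)
#   text = beam_decoder.decode(to_numpy_logits(out))
```

If the model returns a batch, squeeze or index out the batch dimension before calling the helper so that each array passed to decode is [64,29] rather than [1,64,29].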

pranav-chandrode commented 8 months ago

Thank you, it worked.