parlance / ctcdecode

PyTorch CTC Decoder bindings
MIT License

Custom blank symbol #82

gwenniger closed 6 years ago

gwenniger commented 6 years ago

I added an extra argument to the decoders that allows a custom space symbol to be specified. Currently, the space symbol used by the decoder is hard-coded to " ". This is probably fine in most cases, but it does not work for my problem domain, handwriting recognition, where the word separator can be a special symbol such as "|" and the normal space symbol " " may not be used at all.
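To illustrate why the separator matters, here is a minimal standalone sketch in plain Python. It is not the ctcdecode API: the function name `greedy_ctc_words` and its parameters `blank_index` and `space_symbol` are hypothetical and chosen only for this example. It collapses a best-path sequence of label indices in the usual CTC way and then splits the result into words on a configurable separator.

```python
from typing import List, Sequence


def greedy_ctc_words(
    best_path: Sequence[int],
    labels: Sequence[str],
    blank_index: int = 0,      # hypothetical parameter name
    space_symbol: str = " ",   # hypothetical parameter name
) -> List[str]:
    """Collapse a best-path index sequence and split it into words."""
    chars = []
    previous = None
    for index in best_path:
        # CTC collapse: skip repeated indices and skip the blank label.
        if index != previous and index != blank_index:
            chars.append(labels[index])
        previous = index
    text = "".join(chars)
    # Split on the configured word separator and discard empty pieces.
    return [word for word in text.split(space_symbol) if word]


# Toy label set: "_" is the CTC blank, "|" is the word separator.
labels = ["_", "a", "b", "|", " "]
path = [1, 1, 0, 2, 3, 3, 0, 2, 2, 0, 1]

words_custom = greedy_ctc_words(path, labels, space_symbol="|")
words_default = greedy_ctc_words(path, labels)

print(words_custom)   # ['ab', 'ba']
print(words_default)  # ['ab|ba']
```

With the default " " separator the whole output is treated as one word, which is the behaviour that breaks word-level processing when the data uses "|" between words.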