jzlianglu / pykaldi2

Yet another speech toolkit based on Kaldi and PyTorch
MIT License

Decoding with train_transformer_ce.py #25

Closed lallubharteja closed 4 years ago

lallubharteja commented 4 years ago

Hi,

How does one decode with the models trained using train_transformer_ce.py? Is it possible to provide a decoding recipe or point to resources that can be used to build the recipe?

jzlianglu commented 4 years ago

Hi, you can modify this script to dump the log-likelihoods of a transformer model:

https://github.com/jzlianglu/pykaldi2/blob/master/bin/dump_loglikes.py

Then use latgen-faster-mapped or latgen-faster-mapped-parallel for decoding; both read the dumped log-likelihoods.
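
As a rough illustration of what "dumping log-likelihoods" can look like, here is a minimal, library-free sketch. It assumes the common hybrid convention of subtracting log state priors from the log-softmax of the network outputs; whether dump_loglikes.py does exactly this is not confirmed here. The function names, the utterance ID, and the toy numbers are all hypothetical. The output is one utterance in Kaldi's text-mode matrix archive format.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame of raw network outputs."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def pseudo_loglikes(frame_logits, log_priors):
    """Per-frame log posteriors minus log priors (assumed hybrid convention)."""
    return [
        [lp - pr for lp, pr in zip(log_softmax(frame), log_priors)]
        for frame in frame_logits
    ]


def to_kaldi_text_matrix(utt_id, rows):
    """Format one utterance as a Kaldi text-mode matrix archive entry."""
    lines = ["  " + " ".join(f"{v:.6f}" for v in row) for row in rows]
    return f"{utt_id}  [\n" + "\n".join(lines) + " ]\n"


# Toy example: 2 frames, 3 output classes (hypothetical values).
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
log_priors = [math.log(p) for p in (0.5, 0.3, 0.2)]

loglikes = pseudo_loglikes(logits, log_priors)
ark_text = to_kaldi_text_matrix("utt1", loglikes)
print(ark_text)
```

In a real setup the logits would come from the transformer's forward pass and the priors from the training alignments. The resulting text archive could then be passed to latgen-faster-mapped as its log-likelihood rspecifier, alongside the transition model and decoding graph.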

Alternatively, you can generate the lattices directly using this script:

https://github.com/jzlianglu/pykaldi2/blob/master/bin/latgen.py

Again, you would need to modify the script to support the transformer model (that is, to load a transformer model instead of an LSTM model). I will check in some decoding recipes soon. Thanks for pointing that out.

lallubharteja commented 4 years ago

Thanks for the pointers!