srvk / eesen

The official repository of the Eesen project
http://arxiv.org/abs/1507.08240
Apache License 2.0

How to get the sequence of phones after the acoustic model has run #99

Closed zhangjiulong closed 7 years ago

zhangjiulong commented 7 years ago

How can I get the sequence of phones after the acoustic model has run? Thanks.

fmetze commented 7 years ago

You mean the sequence of phones (tokens) without any language model? Should it be computed during training or during validation? I can check in code that extracts IDs, frames, and posteriors for that, if that is what you want.

zhangjiulong commented 7 years ago

I think validation is OK. In fact, I have tried the srvk-eesen-offline-transcriber project with my own model, and it decodes fine with an acoustic model and a language model. But I want to see what the acoustic model outputs before it is sent to the WFST graph.

fmetze commented 7 years ago

I just committed a change to train-ctc-parallel which adds the "--sequence-out-file" option. You can specify a file here during cross-validation, which will contain lines as follows:

utt | 6 3 0.977587 | 28 5 0.876791 | 22 6 0.62183 | 6 9 0.975757

"utt" is a placeholder for the utterance ID, which you need to get from the corresponding feature file. Every field between "|" contains the token ID, the frame number, and the posterior of a CTC peak.

The sequence output uses the same code that computes the token error rate, so it is quite accurate. You could of course tune all kinds of things, such as scaling the posteriors, but none of that is done here.
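For anyone who wants to post-process that file, here is a minimal sketch of a parser for the line format shown above. It assumes each field holds three whitespace-separated values (token ID, frame number, posterior), exactly as in the example line. The names `CtcPeak` and `parse_sequence_line` are made up for illustration and are not part of Eesen.

```python
from typing import List, NamedTuple, Tuple


class CtcPeak(NamedTuple):
    """One CTC peak: token ID, frame number, and posterior (hypothetical type)."""
    token_id: int
    frame: int
    posterior: float


def parse_sequence_line(line: str) -> Tuple[str, List[CtcPeak]]:
    """Split one line of the sequence-out file into (utterance ID, peaks).

    Assumes the layout from the example above:
    utt | id frame posterior | id frame posterior | ...
    """
    fields = [field.strip() for field in line.strip().split("|")]
    utt_id = fields[0]
    peaks = []
    for field in fields[1:]:
        if not field:
            continue  # tolerate a trailing "|" or an empty field
        token_id, frame, posterior = field.split()
        peaks.append(CtcPeak(int(token_id), int(frame), float(posterior)))
    return utt_id, peaks


example = "utt | 6 3 0.977587 | 28 5 0.876791 | 22 6 0.62183 | 6 9 0.975757"
utt_id, peaks = parse_sequence_line(example)
token_ids = [peak.token_id for peak in peaks]
print(utt_id, token_ids)  # utt [6, 28, 22, 6]
```

Turning the token IDs into phone symbols still requires the token table that was used to train the model, which is not shown here.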

zhangjiulong commented 7 years ago

@fmetze thanks that's what I need.