freewym / espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

'frame_subsampling_factor' in chain e2e setup #70

mohsen-goodarzi closed 3 years ago

mohsen-goodarzi commented 3 years ago

In the Kaldi chain (i.e. LF-MMI) setup, there is a parameter called frame_subsampling_factor, which is always set to 3. It is used to reduce the computational cost of the LF-MMI loss.
Unfortunately, I was not able to find this parameter in your chain-e2e setup (run_chain_e2e_bichar.sh).

Is it hard-coded somewhere, or is it implemented in a different way? I would be grateful if you could guide me on this.

freewym commented 3 years ago

Frame subsampling can be achieved within the network itself. For example, you can simply set the stride of the last convolutional layer to 3 to get 3x subsampling (see https://github.com/freewym/espresso/blob/master/examples/asr_wsj/run_chain_e2e_bichar.sh#L201).
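
To illustrate why a stride of 3 amounts to 3x frame subsampling, here is a minimal sketch of the standard output-length formula for a 1-D convolution along the time axis. The helper name `conv_output_length` and the example values (300 input frames, kernel size 3, padding 1) are assumptions chosen for illustration; they are not taken from the Espresso recipe.

```python
def conv_output_length(num_frames, kernel_size, stride=1, padding=0, dilation=1):
    """Number of output frames of a 1-D convolution over the time axis.

    Hypothetical helper for illustration only; it applies the standard
    convolution output-size formula and is not part of Espresso.
    """
    return (num_frames + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Assumed example values, not taken from run_chain_e2e_bichar.sh.
num_frames = 300

# Stride 1 keeps the frame rate unchanged.
full_rate = conv_output_length(num_frames, kernel_size=3, stride=1, padding=1)

# Stride 3 keeps roughly every third frame, i.e. 3x subsampling.
subsampled = conv_output_length(num_frames, kernel_size=3, stride=3, padding=1)

print(full_rate, subsampled)
```

With these assumed values the strided layer emits one output frame per three input frames, which is the same reduction that frame_subsampling_factor=3 gives in the Kaldi chain setup.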

mohsen-goodarzi commented 3 years ago

Oh! Now I see it!
Thank you!