such as word/phone start time, end time, confidence, etc.
tz301 updated
3 years ago
Augmented streams (which have stacked frames shifted) create _n_ copies of the input. At test time the logit streams are averaged together. However, this is buggy under CTC training, as the blank la…
Hi all,
I've seen that the new tf_clean branch is available, so I am trying to use it. I'm using the `swbd/v1-tf` recipe and was able to train it successfully. However, I cannot find any script for single w…
Error while running make after configuring with CMake. It complains:
> cuda_compile_generated_ctc_loss_layer.cu.o' failed
cmake Log
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identificati…
The hparams.py says `n_frames_per_step=1, # currently only 1 is supported`, but a reduction window is very important for the model to pick up alignment. Using a reduction window can be considered as d…
Can you please tell me which layers differ from the original Caffe?
I think this addition adds the CtcLoss layer and the ContinuationIndicator layer. Is that right?
Where is the cpp file of the added layer…
- https://arxiv.org/abs/2104.07787
- 2021
e4exp updated
3 years ago
I am new to Icefall. I would like to extract framewise alignment information like what is shown in #39 with the pretrained model from: https://huggingface.co/csukuangfj/icefall-asr-librispeech…