-
The encoder interface is quite simple: basically any `[LayerRef] -> LayerRef` function, although the interface should also imply the tensor format, e.g. {B,T,D}.
The idea was to have a generi…
-
I have a question. You are using the structure of the Transformer, not the structure from the Transformer-Transducer paper, and the loss function uses CTC instead of RNN-T.
-
Hello! I'm wondering if it is possible to use these models in a real-time inference scenario such as a microphone stream. In other words, will the models work well on smaller chunks of audio like 250 ms instead of on w…
-
First of all, thanks for this amazing work in benchmarking the several available RNNT implementations. This is more of a "discussion" rather than an issue.
I am sure you are aware of this, but t…
-
I am trying to use this implementation with apex half-precision training, but it fails.
The error shows that it needs float rather than half:
______________
File "/data/asr_v3/src/model/transformer_transduc…
-
Is there already an example for this using ONNX?
-
Is it possible to output token-level timestamps?
e.g.:
hello 100-600
world 712-900
.......
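
One way to produce output in that shape, as a minimal sketch only: it assumes the decoder can already report the first and last encoder frame for each token, which is not something this thread confirms. The function name `frames_to_timestamps`, the 10 ms frame shift, and the 4x subsampling factor are all hypothetical values chosen for illustration.

```python
from typing import List, Tuple


def frames_to_timestamps(
    tokens: List[Tuple[str, int, int]],
    frame_shift_ms: float = 10.0,  # assumed feature frame shift
    subsampling: int = 4,          # assumed encoder subsampling factor
) -> List[str]:
    """Turn (token, first_frame, last_frame) triples, indexed in encoder
    frames, into "token start-end" strings in milliseconds."""
    ms_per_frame = frame_shift_ms * subsampling
    lines = []
    for token, first, last in tokens:
        start_ms = int(first * ms_per_frame)
        # The token ends when its last frame ends, hence last + 1.
        end_ms = int((last + 1) * ms_per_frame)
        lines.append(f"{token} {start_ms}-{end_ms}")
    return lines


# Hypothetical alignment, not taken from any real model output.
alignment = [("hello", 2, 14), ("world", 18, 22)]
print("\n".join(frames_to_timestamps(alignment)))
```

The resolution of such timestamps is limited by the encoder frame rate, so with the assumed settings every boundary falls on a multiple of 40 ms.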
Mddct updated
3 years ago
-
## Title & Topic
- Explore the concepts and methods of transduction and transductive learning in the field of sequence prediction
- Understand the concept of the RNN as a transducer
- Understand the concept of Transformer networks, which derive from transduction
## Upload schedule
- [x]…
-
I have trained a Conformer model using my own custom dataset in Thai. However, GPU utilization seems to be pretty low, as the training speed is pretty slow (~2 s/batch). The GPU was utilized by around …
-
**Describe the bug**
A clear and concise description of what the bug is.
**Basic environments:**
- OS information: e.g., Linux 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64
…