-
I'm trying to train a model on a regression problem. The data consists mostly of one-hot encoded categorical variables, plus one continuous variable. The target output is a probability (0-1). Here is the code:
```
def read_lines(f…
```
-
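Following the method in this link:
https://github.com/wenet-e2e/wenet/pull/1982
I first tried decoding with the trained model, config file, and hotword list provided by the author; on librispeech the test other WER was 9.5%. (The author reports 7.93%.)
I then trained my own version, with batch size set to 32 and batch num contex also set to 32, and ran 29 ep…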
-
Most timestamp end times fall inside speech segments rather than in silence, as shown in the figure:
Model used: damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch
Invocation: modelscope pipeline
Test audio: [test-funasr.wav.zip](https://github.com/ali…
-
The `install_models.sh` file downloads three files: one is the `blstm.ep7.9langs-v1.bpej20k.model.py` file, and the other two are `ep7.9langs-v1.bpej20k.bin.9xx` and `ep7.9langs-v1.bpej20k.codes.9xx`. `mlenc…
-
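Hello, I'd like to ask whether you plan to release the model?
I'm implementing this paper and training on the Voice Bank + DEMAND dataset, but my results differ a lot from the reported ones...
Also, one part of the paper seems odd: equation (2) describes element-wise addition, but the figure shows a concat at the end. Which of the two should be followed?
If you are willing to reply, thank you very much.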
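To make the distinction in the question concrete, here is a minimal sketch that is not taken from the paper. The vectors `a` and `b` and their values are hypothetical stand-ins for the two feature streams being fused.

```python
# Toy feature vectors; names and values are illustrative only,
# not the paper's actual features.
a = [1.0, 2.0, 3.0]
b = [0.5, 0.5, 0.5]

# Element-wise addition: both inputs must have the same length,
# and the output keeps that length.
added = [x + y for x, y in zip(a, b)]

# Concatenation: the output length is the sum of the input lengths.
concatenated = a + b

print(len(added), added)
print(len(concatenated), concatenated)
```

The layer that follows the fusion needs a different input size in each case, so the two readings are not interchangeable in an implementation.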
-
Hi there,
I have an idea for `Word-Level training`: it focuses on word-level recognition rather than the line-level and character-level recognition that Tesseract currently uses.
The idea…
-
OS: Windows 10
Python/C++ Version: python 3.8.17
Package Version: pytorch==1.11.0, torchaudio==0.11.0, modelscope==1.8.1, funasr==0.7.3 (pip list)
Model: damo/speech_paraformer-large-vad-punc_asr_nat-zh-c…
-
If you want to use a pre-trained `Transformer` for the same task, how would you use it instead of `LSTM` here? For example, I want to use a lightweight `BERT` model; what would be the changes to the l…
-
Hi everyone!
I ran the recipe given for ST on Must-C V1 (en-de) with the train_rnn configuration. The training accuracy (ACC) reaches 56% after 14 epochs, but the validation BLEU remains at 0 (see scre…
-
We have been enhancing ESPnet2 for the speaker diarization task. The [base code](https://github.com/espnet/espnet/pull/2939) follows the [EEND](https://github.com/hitachi-speech/EEND)…