-
When calculating the speech features for the speech2text models, OpenSeq2Seq calculates a mean and stddev individually for each training sample. Much like batch normalization, during inference, it wou…
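To make the question concrete, here is a minimal NumPy sketch of the two normalization strategies being contrasted: statistics computed from each sample itself, versus fixed statistics gathered ahead of time (as batch normalization does at inference). This is not OpenSeq2Seq's code. The function names, the `(time, n_features)` layout, the per-feature-dimension axis, and the `eps` term are all assumptions made for illustration.

```python
import numpy as np


def normalize_per_sample(features, eps=1e-8):
    """Normalize a (time, n_features) array using its own mean and stddev.

    Hypothetical helper: computing statistics per feature dimension (axis=0)
    is an assumption, not necessarily what OpenSeq2Seq does.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def normalize_with_fixed_stats(features, mean, std, eps=1e-8):
    """Normalize with statistics supplied by the caller.

    Hypothetical helper: `mean` and `std` would be estimated once,
    for example over the training set, and reused at inference time.
    """
    return (features - mean) / (std + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(loc=3.0, scale=2.0, size=(50, 4))

    per_sample = normalize_per_sample(sample)
    # Each feature dimension now has roughly zero mean and unit stddev.
    print(per_sample.mean(axis=0), per_sample.std(axis=0))
```

With fixed statistics, the output for a given frame no longer depends on the rest of the utterance, which matters for streaming or very short inputs.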
-
Is there any way to specify a speech recognition grammar? I am sure that deepspeech will work better if it is used with a grammar. Am I missing something, or is this not yet implemented?
A possible…
-
## ❓ Questions and Help: Failure when exporting from pt to onnx using export.py
#### What is your question?
I encounter the error listed below when running `python export.py`.
**The error details 'Runtime…
-
I have seen some other talk of memory leaks (#390), but I'm having a more sporadic, shorter term issue.
I've experienced this on both an RTX 4070 with 12GB VRAM and an RTX 3090 with 24GB VRAM.
`…
-
Hi,
I am a researcher studying EEG-To-Text. I recently read your Neuspeech paper. I was impressed by it, and it was a great help to my research direction. Thank you. But I have some kinds of quest…
-
The goal of this issue is to create a literature review of NLP techniques, including those that have been used in Political Economy, so that we can converge on a technique for our project.
-
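Hello. Thank you for releasing such an excellent project.
I would like to use your model for inference. As far as I understand, the MPop600 dataset should be used as the input data for inference. Since musical scores are fed into the model, I need to download the MPop600 dataset to understand exactly what scores are fed into the model and to run inference, is that right? Also, I have heard that downloading the dataset requires contacting the authors, but I have searched everywhere and could not find a way to contact them, so I would like to ask whether you know of a feasible way to obtain the data.
Thank you 😃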
-
I tried to do finetuning on a small dataset with 2 speakers. I set `epochs=25`, `diff_epoch=8`, `joint_epoch=15`.
The Style Diffusion training started as expected, but SLM Adversarial Training never …
-
First of all, thank you for this impressive package! I’ve encountered a possible issue when attempting to create longer audio outputs. Specifically, when I set a high word count (e.g., 5000 words) to …
-
Implement an opcode or opcodes for vocal singing synthesis in Csound, inspired by Vocaloid, Sinsy, and such.
The opcode should take marked up text and some form of vocal model, and synthesize a music…