speech-language-model Search Results

NVIDIA/OpenSeq2Seq #422

Speech features not using dataset statistics

When calculating the speech features for the speech2text models, OpenSeq2Seq calculates a mean and stddev individually for each training sample. Much like batch normalization, during inference, it wou…

jangalt updated 5 years ago

mozilla/DeepSpeech #1290

Specify grammar file SRGS

Is there any way to specify a speech recognition grammar? I am sure that DeepSpeech will work better if it is used with a grammar. Am I missing something or is this not yet implemented? A possible…

NicoHood updated 6 years ago

FunAudioLLM/SenseVoice #78

[about ONNX export]

❓ Questions and Help: Fail in exporting from pt to onnx using export.py. What is your question? encounter error listed below when running `python export.py`. The error details 'Runtime…

LateLinux updated 1 month ago

SYSTRAN/faster-whisper #442

(WSL2) RuntimeError: CUDA failed with error out of memory

I have seen some other talk of memory leaks (#390), but I'm having a more sporadic, shorter term issue. I've experienced this on both an RTX 4070 with 12GB VRAM and an RTX 3090 with 24GB VRAM. `…

coder543 updated 1 week ago

NeuSpeech/NeuSpeech1 #1

I have some questions

Hi, I am a researcher studying EEG-to-text. I recently saw your NeuSpeech paper. I was impressed by it, and it was a great help to my research direction. Thanks. But I have some kinds of quest…

girlsending0 updated 1 month ago

Houdanait/PoliticalTextandAttitudes #2

Do a literature review of NLP in PolEcon and converge on a …

The goal of this issue is to create a literature review of NLP techniques, and those that have been used in Political Economy to converge on a technique for our project.

Houdanait updated 6 months ago

york135/zero_shot_svs_ASRU2023 #1

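How can I obtain the MPop600 dataset?

Hello. Thank you for releasing such an excellent project. I would like to run inference with your model. As I understand it, the MPop600 dataset should be used as the inference input. Since musical scores are fed into the model, I need to download the MPop600 dataset to see exactly what scores go into the model and to run inference, right? Also, I have heard that the dataset can only be downloaded by contacting the authors, but I have searched everywhere and could not find a way to contact them, so I would like to ask whether there is a workable way to obtain this data. Thank you 😃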

MuHyeonSon updated 2 months ago

yl4579/StyleTTS2 #227

SLM Adversarial Training did not start when finetuning

I tried to do finetuning on a small dataset with 2 speakers. I set `epochs=25`, `diff_epoch=8`, `joint_epoch=15`. The Style Diffusion training started as expected, but SLM Adversarial Training never …

godspirit00 updated 4 months ago

souzatharsis/podcastfy #166

Limitation on Podcast Length with Higher Word Counts

First of all, thank you for this impressive package! I’ve encountered a possible issue when attempting to create longer audio outputs. Specifically, when I set a high word count (e.g., 5000 words) to …

KlaasYntema updated 4 days ago

csound/csound #658

Singing synthesis opcodes

Implement an opcode or opcodes for vocal singing synthesis in Csound, inspired by Vocaloid, Sinsy, and such. The opcode should take marked up text and some form of vocal model, and synthesize a music…

gogins updated 3 hours ago

1000+ results for speech-language-model
