-
https://github.com/NVIDIA/BigVGAN?tab=readme-ov-file#news
-
I was trying to run the repo on Colab:
- Inference for text-to-speech synthesis
- Inference from wav file
using the commands given in the README file, but I am facing some errors that I cannot mo…
-
Here's a ChatGPT list of effects (not exhaustive):
Compressor:
Reduces the dynamic range by attenuating peaks, providing a more even sound. Commonly used in mixing and mastering to contr…
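
To make the compressor description above concrete, here is a minimal sketch of a static, sample-wise, hard-knee compressor. It is an illustration only: the function name `compress` and the `threshold_db` and `ratio` defaults are assumptions, not taken from the issue, and real compressors also smooth the gain with attack and release times, which this sketch omits.

```python
import numpy as np

def compress(x, threshold_db=-20.0, ratio=4.0, eps=1e-12):
    """Hypothetical static compressor: attenuate samples above the threshold.

    x is a float waveform in [-1, 1]. No attack/release smoothing is applied.
    """
    x = np.asarray(x, dtype=np.float64)
    # Instantaneous level of each sample in dBFS (eps avoids log10(0)).
    level_db = 20.0 * np.log10(np.maximum(np.abs(x), eps))
    # How far each sample exceeds the threshold (0 dB if below it).
    over_db = np.maximum(level_db - threshold_db, 0.0)
    # With ratio R, an overshoot of D dB is reduced to D / R dB.
    gain_db = -over_db * (1.0 - 1.0 / ratio)
    return x * 10.0 ** (gain_db / 20.0)

if __name__ == "__main__":
    quiet_and_loud = np.array([0.05, 1.0])
    print(compress(quiet_and_loud))
```

Samples below the threshold pass through unchanged, while peaks are scaled down, which is the "attenuating peaks" behaviour the description refers to.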
-
Hi, I'm highly interested in singing voice synthesis. I saw that there are pretrained models for SVS, but I don't understand how to specify what the model should sing, e.g. is it possible to have it sing a tex…
-
## Describe the bug
In [models/base/new_inference.py](https://github.com/open-mmlab/Amphion/blob/57ec89cecb0fa6097d13471a03a6545145221265/models/base/new_inference.py#L145C5-L145C5)
`vocoder_cfg, …
-
I tried two different versions of WaveRNN vocoders, where each vocoder has a different output node. The first one uses RAW (512-dim softmax, 9-bit mu-law encoded waveform sample target) and the last one use…
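
For readers unfamiliar with the RAW target mentioned above, the following is a minimal sketch of 9-bit mu-law encoding, which maps each waveform sample to one of 512 class ids for a softmax output. The function names `mulaw_encode` and `mulaw_decode` are hypothetical and this is not necessarily how the WaveRNN implementation in question does it; it only shows the standard companding formula.

```python
import numpy as np

MU = 511  # 9 bits -> 2**9 = 512 classes, so mu = 511 (assumed convention)

def mulaw_encode(x, mu=MU):
    """Map float samples in [-1, 1] to integer class ids in [0, mu]."""
    x = np.clip(np.asarray(x, dtype=np.float64), -1.0, 1.0)
    # Compand: compress large amplitudes logarithmically.
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Shift from [-1, 1] to [0, mu] and round to the nearest integer.
    return np.floor((y + 1.0) / 2.0 * mu + 0.5).astype(np.int64)

def mulaw_decode(ids, mu=MU):
    """Approximate inverse: class ids in [0, mu] back to floats in [-1, 1]."""
    y = 2.0 * np.asarray(ids, dtype=np.float64) / mu - 1.0
    return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu

if __name__ == "__main__":
    wave = np.sin(np.linspace(0.0, 2.0 * np.pi, 8))
    ids = mulaw_encode(wave)
    print(ids)                # integer targets for a 512-way softmax
    print(mulaw_decode(ids))  # lossy reconstruction of the waveform
```

The round trip is lossy: quantization error is smallest near zero amplitude and largest near full scale.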
-
Hello!
As speech-to-text models such as Whisper are added, having access to some of the impressive AI text-to-speech models would be a nice way to close the loop!
My current suggestion for a model …
-
I found that several (most) vocoders and other TTS models use 80 mel-spectrogram channels.
In this work, the model uses 512 channels.
Why is this model using 512 channels, which is way more…
-
Hello! I would like to ask whether it is possible to add a custom vocoder to a TTS model on a Hugging Face Space. If so, how can I do that? Basically, for my TTS model that is trained on male spee…