-
Dear Paveel,
Thanks a lot for releasing the supplementary material at https://arxiv.org/pdf/2203.13086v4.pdf and releasing the architecture code at this repo.
However, I don't find in the Githu…
-
I removed the postnet (including the model and loss code related to it), set pitch_quantization="log", set the pitch and energy features to "frame_level", set normalization="False", and other configurat…
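For readers unfamiliar with the setting, here is a minimal sketch of what log-scale pitch quantization usually amounts to: bin boundaries are spaced evenly in log-frequency, and each frame-level F0 value is mapped to a bin index. The function names and the 50 to 800 Hz range are assumptions for illustration, not this repo's code. Log spacing needs strictly positive values, which is one reason it tends to be paired with unnormalized pitch.

```python
import bisect
import math

def log_spaced_boundaries(f_min, f_max, n_bins):
    """Return n_bins - 1 boundaries spaced evenly in log-frequency (hypothetical helper)."""
    log_min, log_max = math.log(f_min), math.log(f_max)
    step = (log_max - log_min) / n_bins
    return [math.exp(log_min + step * i) for i in range(1, n_bins)]

def quantize_pitch(f0_values, boundaries):
    """Map each frame-level F0 value (Hz) to a bin index via binary search."""
    return [bisect.bisect_left(boundaries, f0) for f0 in f0_values]

# Assumed range of 50-800 Hz split into 4 bins: boundaries near 100, 200, 400 Hz.
boundaries = log_spaced_boundaries(f_min=50.0, f_max=800.0, n_bins=4)
bins = quantize_pitch([60.0, 150.0, 300.0, 700.0], boundaries)
```

Each bin index would then typically select a row of a learned pitch embedding table.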
-
Could you tell me where I can find the relevant implementation?
-
First of all, thank you for the project. I think it is really useful, especially since the official NVIDIA implementation has not been released yet!
Did you manage to train the model to satisfactory quali…
-
See if there is a way to run text to speech and read the generated text out loud.
Ideally, this happens in parallel to the LLM generating tokens (using the [second CPU core](https://coral.ai/docs/d…
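One possible shape for this, as a rough sketch: the generator loop pushes complete sentences onto a queue while a worker thread consumes them and speaks. `generate_tokens` and `speak` are hypothetical stand-ins for the LLM and the TTS engine, not real APIs, and `speak` here only records the text it receives.

```python
import queue
import threading

def generate_tokens():
    """Hypothetical stand-in for an LLM that yields tokens one at a time."""
    for token in ["Hello", " there", ".", " How", " are", " you", "?"]:
        yield token

def speak(sentence, spoken):
    """Hypothetical stand-in for a TTS call; here it only records the text."""
    spoken.append(sentence)

def tts_worker(sentences, spoken):
    """Consume sentences until the None sentinel arrives."""
    while True:
        sentence = sentences.get()
        if sentence is None:
            break
        speak(sentence, spoken)

def run():
    sentences = queue.Queue()
    spoken = []
    worker = threading.Thread(target=tts_worker, args=(sentences, spoken))
    worker.start()

    buffer = ""
    for token in generate_tokens():
        buffer += token
        # Hand off a chunk as soon as a sentence boundary appears.
        if buffer.rstrip().endswith((".", "?", "!")):
            sentences.put(buffer.strip())
            buffer = ""
    if buffer.strip():
        sentences.put(buffer.strip())
    sentences.put(None)  # signal the worker to stop
    worker.join()
    return spoken

spoken = run()
```

Chunking at sentence boundaries keeps the speech natural while still starting playback before generation finishes. Because of Python's global interpreter lock, a thread overlaps work only when the TTS call releases the lock (for example in native code or while waiting on audio I/O); to keep a second core busy, a separate process with a `multiprocessing` queue may be the better fit.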
-
With the rise of fast vector databases for approximate nearest-neighbor search (FLANN, Annoy, Chroma, Milvus, Weaviate, etc.), it becomes increasingly useful to have vector representations of audi…
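To make the use case concrete, here is a small sketch of the exact search that those systems approximate: a brute-force cosine-similarity lookup over a handful of embeddings. The clip names and 3-dimensional vectors are made up for illustration; real embeddings would come from a model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, index, k=2):
    """Return the names of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Made-up embeddings standing in for per-clip vectors.
index = {
    "clip_a": [1.0, 0.0, 0.0],
    "clip_b": [0.9, 0.1, 0.0],
    "clip_c": [0.0, 1.0, 0.0],
}
result = nearest([1.0, 0.0, 0.0], index, k=2)
```

This scan is linear in the number of stored vectors, which is the cost an approximate index trades a little accuracy to avoid.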
-
Hi there,
I can't get the larynx.text_to_speech python function to work. I'm getting these errors and then the audio that plays is just noise:
```
2023-04-03 14:51:03.305121301 [W:onnxruntime:, e…
-
Dear authors and readers,
I would appreciate it if you would give me an answer to my question:
Is it possible to convert the spectrogram image (not the array) to wav (reconstruct the wav audio from th…
-
Hi devs. I am not a dev, but I'd like to suggest making this TTS system available for assistive tech on Windows. This could mean one of two things: one is cross-program and the other is for a specific pro…
-
I'd like to train HiFi-GAN on a custom dataset with its own set of wav files. For this I need to generate the corresponding mel spectrograms, which the README says can be done using Tacotron2 although…