-
Currently the model uses the universal HiFi-GAN vocoder to produce waveforms; other vocoders can also be used.
Vocos expects a 100-bin mel spectrogram, whereas this model is trained on 80-bin mel spectrograms.
-
Hi @KdaiP, nice work. I'd just like to know whether this architecture is intended to support zero-shot TTS or ordinary multi-speaker TTS?
-
```
File "....\MARS5-TTS\./mdl\hub\Camb-ai_mars5-tts_master\inference.py", line 291, in tts
final_audio = self.vocode(final_output).squeeze()
^^^^^^^^^^^^^^^^^^^^^^^^^
File…
```
-
Hi,
this is great work. I noticed that the ISTFT head does not use the same padding that you implemented; according to your config file, it uses the center padding provided by the torch API.
for…
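To make the padding question concrete, here is a minimal sketch of the frame-count arithmetic for the two schemes. The values of `n_fft`, `hop_length`, and the signal length are illustrative assumptions, not taken from the config file, and `num_frames` is a hypothetical helper rather than part of any library.

```python
def num_frames(num_samples: int, n_fft: int, hop_length: int, pad_per_side: int) -> int:
    """Number of STFT frames for a signal padded by pad_per_side samples on each side."""
    padded = num_samples + 2 * pad_per_side
    return 1 + (padded - n_fft) // hop_length


# Illustrative values only (assumed, not read from the repository's config).
n_fft = 1024
hop_length = 256
num_samples = 25600  # exactly 100 hops

# "center" padding: n_fft // 2 samples on each side.
center_frames = num_frames(num_samples, n_fft, hop_length, n_fft // 2)

# "same" padding: (win_length - hop_length) // 2 samples on each side,
# assuming win_length == n_fft.
same_frames = num_frames(num_samples, n_fft, hop_length, (n_fft - hop_length) // 2)

print(center_frames, same_frames)
```

Under these assumptions the two schemes differ by one frame, so features computed with one padding and inverted with the other will not line up in length unless the extra frame is handled.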
-
Do you have a demo? In my experiments, this model performs worse than the original VITS. Maybe the input to Vocos must be a speech feature (e.g., mel)? Can you tell me your model performance or other improvem…
-
The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See htt…
-
Hi, thanks for your work. I would like to ask whether Vocos can decode in a streaming fashion when reconstructing audio from EnCodec tokens?
-
Hi, thank you for the great project you have made available!
I added it to my one-click installer package of AI-based audio generators. [Link](https://github.com/rsxdalv/tts-generation-webui/)
H…
-
Hi @hubertsiuzdak, I am trying to figure out how to convert my ckpt to pytorch_model.bin so that I can load the model with vocos.pretrained. Alternatively, is there a way to load the ckpt directly for inference?
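One possible approach, sketched under assumptions: a PyTorch Lightning checkpoint stores its weights under the `"state_dict"` key, so the generator weights could be filtered out by key prefix and saved on their own. The prefixes below (`feature_extractor.`, `backbone.`, `head.`) and the discriminator key names in the demo are assumptions about how the training module names its submodules, and `extract_generator_weights` is a hypothetical helper, so check them against your own checkpoint's keys.

```python
def extract_generator_weights(state_dict, prefixes=("feature_extractor.", "backbone.", "head.")):
    """Keep only entries whose key starts with one of the (assumed) generator prefixes."""
    return {key: value for key, value in state_dict.items() if key.startswith(prefixes)}


# Stand-in for a real checkpoint's state dict; the values would be tensors in practice.
demo_state_dict = {
    "backbone.embed.weight": 1,
    "head.out.weight": 2,
    "feature_extractor.mel_spec.window": 3,
    "multiperioddisc.discriminators.0.weight": 4,  # assumed discriminator key, dropped
}

generator_weights = extract_generator_weights(demo_state_dict)
print(sorted(generator_weights))

# With a real checkpoint, this would look roughly like (paths are placeholders):
#   import torch
#   ckpt = torch.load("your_checkpoint.ckpt", map_location="cpu")
#   torch.save(extract_generator_weights(ckpt["state_dict"]), "pytorch_model.bin")
```

The saved file would still need the matching config next to it for the pretrained loader to rebuild the model before loading the weights.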
-
As mentioned in the paper, will you provide pretrained weights for the model?
Also, reconstruction from EnCodec tokens using [vocos](https://github.com/charactr-platform/vocos) may boost the quality of audio res…