Closed jiaqili3 closed 3 months ago
Do we have any pretrained models or demo for this new valle?
This is covered in the README in egs/tts/valle_v2, and demo.ipynb has been uploaded so you can run inference with the pretrained weights.
Hi @RMSnow, thanks for your review! I've updated the code and addressed the questions from your previous review.
Hi @jiaqili3, please update the demo.ipynb. Everything else looks good to me.
Updated. Thanks @RMSnow
✨ Description
In this PR, we release an unofficial PyTorch implementation of VALL-E, a zero-shot voice cloning model based on neural codec language modeling. If trained properly, this model could match the performance reported in the original paper. It is a refined version of the first VALL-E implementation in Amphion: we have changed the underlying implementation to Llama to provide better model performance, faster training, and more readable code. It can be a useful tool for users who want to learn about speech language models and their implementation.
🚧 Related Issues
None
👨‍💻 Changes Proposed
🧑‍🤝‍🧑 Who Can Review?
@HeCheng0625 @RMSnow @HarryHe11 @zhizhengwu
✅ Checklist