VALL-E new version release - Githubissues

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

https://openhlt.github.io/amphion/

MIT License

4.28k stars 365 forks

VALL-E new version release #223

Closed jiaqili3 closed 3 weeks ago

jiaqili3 commented 3 weeks ago

✨ Description

Add more README information about our new VALL-E implementation, and align the VALLE V2 folder name with the naming used by other Amphion files.

🚧 Related Issues

None

👨‍💻 Changes Proposed

- [x] We have changed the underlying implementation to Llama, which gives better model performance, faster training, and more readable code.
- [x] We provide a more detailed README.md that covers reproducing our models with the pretrained weights, training on LibriTTS, and our future plans for improving the model.
- [x] We use SpeechTokenizer, a refined codec model, as the codec, which yields better modeling quality than the original Encodec.

🧑‍🤝‍🧑 Who Can Review?

@RMSnow

✅ Checklist

- [x] Code has been reviewed
- [x] Code complies with the project's code standards and best practices
- [x] Code has passed all tests
- [x] Code does not affect the normal use of existing features
- [x] Code has been commented properly
- [x] Documentation has been updated (if applicable)
- [x] Demo/checkpoint has been attached (if applicable)