open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License

Add support of visualization #141

Closed by lmxue 8 months ago

lmxue commented 8 months ago

✨ Description

We release the Visualization Tool, a unique feature in Amphion, designed to visually analyze classical audio, music, and speech generation models for educational purposes. It offers an interactive experience for beginners, engineers, and researchers alike, enabling them to explore and understand the inner workings of generative models.

Currently, Amphion provides SingVisio, a visualization tool for the diffusion model used in singing voice conversion. The paper, "SingVisio: Visual Analytics of the Diffusion Model for Singing Voice Conversion", is now available. The SingVisio tool is also available to try online.

SingVisio Demo

https://github.com/open-mmlab/Amphion/assets/33707885/0a6e39e8-d5f1-4288-b0f8-32da5a2d6e96

🚧 Related Issues

#124

👨‍💻 Changes Proposed

🧑‍🤝‍🧑 Who Can Review?

@yuantuo666 @RMSnow

🛠 TODO

None

✅ Checklist