Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
✨ Description
We release the Visualization Tool, a unique feature in Amphion, designed to visually analyze classical audio, music, and speech generation models for educational purposes. It offers an interactive experience for beginners, engineers, and researchers alike, enabling them to explore and understand the inner workings of generative models.
Currently, Amphion provides SingVisio, a visualization tool for the diffusion model used in singing voice conversion. The paper, "SingVisio: Visual Analytics of the Diffusion Model for Singing Voice Conversion", is now available, and the SingVisio tool can be tried out here.
SingVisio Demo
https://github.com/open-mmlab/Amphion/assets/33707885/0a6e39e8-d5f1-4288-b0f8-32da5a2d6e96
🚧 Related Issues
124
👨💻 Changes Proposed
egs/visualization directory to introduce Amphion's visualization feature.
egs/visualization/SingVisio directory to introduce SingVisio.
Amphion/visualization/SingVisio directory.
🧑‍🤝‍🧑 Who Can Review?
@yuantuo666 @RMSnow
🛠 TODO
None
✅ Checklist