Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
✨ Description
This PR adds PicoAudio to the Amphion toolkit.
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
repo: https://github.com/zeyuxie29/PicoAudio
paper: https://arxiv.org/abs/2407.02869v2
demo: https://zeyuxie29.github.io/PicoAudio.github.io/
Hugging Face Space: https://huggingface.co/spaces/amphion/PicoAudio
🚧 Related Issues
[List the issue numbers related to this PR]
👨‍💻 Changes Proposed
🧑‍🤝‍🧑 Who Can Review?
@zhizhengwu @HeCheng0625
🛠 TODO
✅ Checklist