open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License

Discrete Diffusion in NS3 #240

Closed: shreeshailgan closed this issue 2 months ago

shreeshailgan commented 2 months ago

@HeCheng0625 Could you provide pointers to papers or GitHub repos that would help in understanding the discrete diffusion architecture used in NaturalSpeech 3?

Also, when can we expect the implementation of NaturalSpeech 3 to be released?

Thanks.

jiaqili3 commented 2 months ago

Hi @shreeshailgan, for papers, I recommend looking at the references in the NaturalSpeech 3 paper, such as MaskGIT. Regarding the code release of NaturalSpeech 3 in Amphion, I believe the training code for discrete diffusion will be released in Amphion. The FACodec code is already in this repo. Thanks!
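
For readers unfamiliar with MaskGIT, here is a rough sketch of its mask-and-predict decoding loop, which is the general idea behind mask-based discrete generation. This is not the NaturalSpeech 3 or Amphion implementation. The `MASK` sentinel, `toy_predictor` (a random stand-in for a trained model), and all function names are hypothetical, and the cosine schedule is just one common choice.

```python
import math
import random

MASK = -1  # hypothetical sentinel marking a position that is still masked


def cosine_mask_ratio(progress: float) -> float:
    """Fraction of tokens left masked at a given progress in [0, 1]."""
    return math.cos(0.5 * math.pi * progress)


def toy_predictor(tokens, vocab_size, rng):
    """Stand-in for a trained model: one (token, confidence) guess per position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def maskgit_decode(seq_len, vocab_size, num_steps, predictor=toy_predictor, seed=0):
    """Start fully masked, then commit the most confident guesses step by step."""
    rng = random.Random(seed)
    tokens = [MASK] * seq_len
    for step in range(1, num_steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        preds = predictor(tokens, vocab_size, rng)
        # How many positions should remain masked after this step.
        keep_masked = math.floor(seq_len * cosine_mask_ratio(step / num_steps))
        # Always commit at least one token so the loop makes progress.
        keep_masked = min(keep_masked, len(masked) - 1)
        # Rank the masked positions by confidence, highest first.
        ranked = sorted(masked, key=lambda i: preds[i][1], reverse=True)
        for i in ranked[: len(masked) - keep_masked]:
            tokens[i] = preds[i][0]
    return tokens


if __name__ == "__main__":
    print(maskgit_decode(seq_len=16, vocab_size=8, num_steps=4))
```

In a real system the predictor would be a trained network conditioned on the partially unmasked sequence (and on any prompt), and the confidence would come from its output probabilities.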