lucidrains / naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
MIT License

audio codec and the diffusion model are trained together? #42

Open BumbleStone opened 2 months ago

BumbleStone commented 2 months ago

It seems that the audio codec and the diffusion model are trained together in this implementation, rather than separately as described in the paper.