lucidrains / voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
MIT License
589 stars 49 forks

Audio samples? #28

Closed. blx0102 closed this 11 months ago

blx0102 commented 11 months ago

Great work here! It seems you have already combined voicebox with spear-tts. Could you provide some resulting audio samples?

lucidrains commented 11 months ago

@blx0102 Lucas has already shared some early audio samples with me. Seems to work

lucasnewman commented 11 months ago

@blx0102 It's still early days here as we dial in the training and inference, but here's an early sample along with the prompt that was used for the semantic tokens. This is a 93M param model that's done about 200k training steps @ effective batch size of 16 on LibriTTS-R.