Closed: lucasnewman closed this 11 months ago
This adds a simple Accelerate-enabled trainer class for training on audio-only data, optionally conditioned on the semantic tokens produced by the Spear-TTS wav2vec implementation. I verified that the loss converges on LibriTTS-R.
boss