gitmylo / bark-voice-cloning-HuBERT-quantizer

The code for the bark-voicecloning model. Training and inference.
MIT License
671 stars 111 forks

generate semantic tokens from wavs #5

Closed (eschmidbauer closed 1 year ago)

eschmidbauer commented 1 year ago

Thanks for sharing this code. I've run through the steps, and it appears to generate the semantic tokens from text and then generate the wav files from the semantic tokens. But is it possible to generate the semantic tokens from a set of wav files?

gitmylo commented 1 year ago

This code extracts semantic tokens from a wav file, not from text.

This is wav -> semantic tokens, which is what you're looking for.

Which part of the code makes you think it uses text?

eschmidbauer commented 1 year ago

I'm sorry, I meant to open this issue on https://github.com/gitmylo/bark-data-gen