Closed by eschmidbauer 1 year ago
This code extracts semantic tokens from a wav file, not text.
This is wav->semantic tokens, like what you're looking for.
Which part of the code makes you think it uses text?
I'm sorry, I meant to open this issue on https://github.com/gitmylo/bark-data-gen
Thanks for sharing this code. I've run through the steps, and it appears to generate the semantic tokens from text and then generate the wav files from the semantic tokens. But is it possible to generate the semantic tokens from a set of wav files?