suno-ai / bark

🔊 Text-Prompted Generative Audio Model
MIT License

Questions about the features of the history prompt #250

Closed susenyang closed 1 year ago

susenyang commented 1 year ago

I want to use my own audio as the history prompt, which means I need to extract the corresponding semantic, coarse, and fine features from it. To stay aligned with the pre-trained models, can you tell me which versions of the models (hubert_kmeans, Encodec) you use to extract these features?

asr-pub commented 1 year ago

Use this fork: https://github.com/serp-ai/bark-with-voice-clone/blob/main/clone_voice.ipynb

gkucsko commented 1 year ago

Unfortunately, we can't release that part, in order to prevent unauthorized voice cloning. There are some forks with approximations, like the one above (not maintained by Suno), that you can have a look at. But in most cases the cloned voice probably won't sound particularly similar to the prompt.
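
For anyone trying to work out what a history prompt has to contain, here is a minimal sketch of how the three feature arrays seem to fit together. It does not extract anything from audio, because the semantic tokenizer is the unreleased part. The key names (`semantic_prompt`, `coarse_prompt`, `fine_prompt`), the vocabulary sizes, and the 2/8 codebook split are assumptions based on Bark's bundled speaker prompts and generation code, and `make_history_prompt` is a hypothetical helper, not part of Bark.

```python
import numpy as np

# Assumed constants, believed to match Bark's generation code (not confirmed in this thread).
SEMANTIC_VOCAB_SIZE = 10_000  # number of distinct semantic token ids
CODEBOOK_SIZE = 1024          # entries per EnCodec codebook
N_COARSE_CODEBOOKS = 2        # coarse prompt = first two codebooks
N_FINE_CODEBOOKS = 8          # fine prompt = all eight codebooks


def make_history_prompt(semantic, fine):
    """Hypothetical helper: bundle token arrays into a history-prompt dict.

    semantic: 1-D array of semantic token ids.
    fine:     2-D array of codec tokens with shape (N_FINE_CODEBOOKS, T).
    The coarse prompt is taken as the first N_COARSE_CODEBOOKS rows of `fine`.
    """
    semantic = np.asarray(semantic, dtype=np.int64)
    fine = np.asarray(fine, dtype=np.int64)

    if semantic.ndim != 1 or semantic.size == 0:
        raise ValueError("semantic must be a non-empty 1-D array")
    if fine.ndim != 2 or fine.shape[0] != N_FINE_CODEBOOKS or fine.shape[1] == 0:
        raise ValueError(f"fine must have shape ({N_FINE_CODEBOOKS}, T) with T > 0")
    if semantic.min() < 0 or semantic.max() >= SEMANTIC_VOCAB_SIZE:
        raise ValueError("semantic token id out of range")
    if fine.min() < 0 or fine.max() >= CODEBOOK_SIZE:
        raise ValueError("codec token id out of range")

    return {
        "semantic_prompt": semantic,
        "coarse_prompt": fine[:N_COARSE_CODEBOOKS].copy(),
        "fine_prompt": fine,
    }
```

Saving the result with `np.savez("my_voice.npz", **prompt)` should give a file with the same layout as the bundled speaker prompts, as far as I can tell. How close the resulting voice sounds still depends entirely on where the semantic tokens come from.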