Open gukush opened 1 month ago
To get the bark-with-voice-clone repository working, the following needs to be done:
- Install fairseq from source: https://github.com/facebookresearch/fairseq
- In the fairseq source, edit "setup.py": remove the version requirements for omegaconf and hydra-core
- pip install audiolm-pytorch
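The "setup.py" edit can also be scripted. Below is a minimal sketch, assuming the requirements appear in "setup.py" as quoted strings with a version specifier attached; `strip_version_pins` is a hypothetical helper and the sample line is illustrative, not copied from fairseq.

```python
import re


def strip_version_pins(setup_text, packages=("omegaconf", "hydra-core")):
    """Remove version specifiers from the named quoted requirement strings.

    Hypothetical helper: assumes each requirement is written as a quoted
    string such as "omegaconf<2.1" inside setup.py.
    """
    for name in packages:
        # Opening quote, package name, a specifier operator, then everything
        # up to the matching closing quote.
        pattern = re.compile(
            r"""(["'])""" + re.escape(name) + r"""\s*[<>=!~][^"']*\1"""
        )
        # Keep the quotes and the bare package name, drop the specifier.
        setup_text = pattern.sub(
            lambda m, name=name: m.group(1) + name + m.group(1), setup_text
        )
    return setup_text


# Illustrative input only; check the real setup.py before editing it.
sample = 'install_requires=["hydra-core>=1.0.7,<1.1", "omegaconf<2.1", "numpy"]'
print(strip_version_pins(sample))
```

To apply it, read "setup.py" from the fairseq checkout, pass the text through the function, write it back, and then install from that checkout.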
Keep in mind that Bark is a text-to-speech (TTS) model and does not do voice-to-voice conversion.
fairseq currently fails to load the model, apparently because it is incompatible with Python 3.12. The fairseq dependency will have to be switched from the official repository to a more recent fork.
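Until a fork is picked, a small guard can flag the problem before the model load is attempted. This is only a sketch: `fairseq_python_ok` is a hypothetical helper, and the 3.12 cutoff comes from the observation above rather than from fairseq's documentation.

```python
import sys


def fairseq_python_ok(version_info=sys.version_info, first_bad=(3, 12)):
    """Return True when the interpreter is older than the first known-bad version.

    Assumption: official fairseq breaks from Python 3.12 onward, as observed
    in this issue.
    """
    return tuple(version_info[:2]) < first_bad


if not fairseq_python_ok():
    print("Warning: official fairseq may fail to load models on this Python version.")
```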
For now, extraction of semantic tokens has been added. What is still needed are the coarse, fine, and other models from Bark in order to generate the audio.
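The remaining stages could be wired together roughly as follows. This is a sketch of the data flow only: `synthesize` and its parameters are hypothetical names, and the stage functions are passed in as callables so nothing here depends on Bark being installed. In suno-ai/bark the stages appear to correspond to `generate_text_semantic`, `generate_coarse`, `generate_fine`, and `codec_decode` in `bark.generation`, but that mapping should be checked against the Bark source.

```python
def synthesize(text, to_semantic, to_coarse, to_fine, decode, voice_prompt=None):
    """Run text through the semantic -> coarse -> fine -> audio stages.

    All four stage callables are supplied by the caller (assumed to wrap the
    Bark models). voice_prompt is the cloned-voice conditioning, passed to
    every generation stage; None means no cloning.
    """
    semantic_tokens = to_semantic(text, voice_prompt)         # text -> semantic tokens
    coarse_tokens = to_coarse(semantic_tokens, voice_prompt)  # semantic -> coarse codec tokens
    fine_tokens = to_fine(coarse_tokens, voice_prompt)        # coarse -> fine codec tokens
    return decode(fine_tokens)                                # codec tokens -> waveform
```

Passing the stages in as arguments makes it possible to test the semantic step that already exists with stand-ins for the models that are still missing.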
Initial links:
- https://github.com/suno-ai/bark
- https://github.com/serp-ai/bark-with-voice-clone
- https://www.reddit.com/r/singularity/comments/12udgzh/bark_text2speechbut_with_custom_voice_cloning/
This task aims to analyze the above solutions from a voice cloning perspective and implement the technique.