Add support for Bark voice cloning

gukush / audio-watermark-242

Repository for a research project about watermarking audio

Add support for Bark voice cloning #6

Open gukush opened 1 month ago

gukush commented 1 month ago

Initial links:
- https://github.com/suno-ai/bark
- https://github.com/serp-ai/bark-with-voice-clone
- https://www.reddit.com/r/singularity/comments/12udgzh/bark_text2speechbut_with_custom_voice_cloning/

This task aims to analyze the above solutions from a voice cloning perspective and implement the technique.

gukush commented 3 weeks ago

To get the bark-with-voice-clone repository working, the following things need to be done:

- Install fairseq from source: https://github.com/facebookresearch/fairseq
- In the fairseq source, change "setup.py": remove the version requirements for omegaconf and hydra-core
- pip install audiolm-pytorch
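
The "setup.py" change described above can also be scripted instead of edited by hand. The following is a minimal sketch, not the method actually used in this thread. It assumes the requirements appear in "setup.py" as quoted strings such as "hydra-core>=1.0.7,<1.1" (the exact pins may differ between fairseq revisions), and the helper name strip_pins is hypothetical.

```python
import re

# Packages whose version requirements should be dropped (named in this thread).
UNPINNED = ("omegaconf", "hydra-core")

# Matches a quoted requirement such as "hydra-core>=1.0.7,<1.1" and captures
# the quote character and the bare package name. Assumption: requirements are
# written as simple quoted strings in setup.py.
_PIN = re.compile(
    r"""(["'])(%s)\s*[<>=!~][^"']*\1""" % "|".join(re.escape(p) for p in UNPINNED)
)


def strip_pins(setup_py_text: str) -> str:
    """Return the setup.py text with version specifiers removed for UNPINNED.

    Requirements for any other package are left untouched.
    """
    return _PIN.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{m.group(1)}", setup_py_text
    )


# Hypothetical usage inside a fairseq checkout:
#   from pathlib import Path
#   path = Path("setup.py")
#   path.write_text(strip_pins(path.read_text()))
```

After rewriting the file, fairseq would still be installed from the checkout as usual (for example with pip install from the source directory).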

Keep in mind that Bark is a text-to-speech (TTS) model and does not do voice -> voice transformations.

gukush commented 2 weeks ago

For some reason fairseq now refuses to load the model: it is incompatible with Python 3.12. The fairseq repository will have to be switched from the official one to a recent fork.
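
Until a fork is chosen, a small guard can make the failure explicit before the model load is attempted. This is only a sketch: the threshold reflects the single data point reported here (Python 3.12), whether earlier versions work is not established in this thread, and the function name is hypothetical.

```python
import sys

# First Python version reported as failing in this thread. Assumption: nothing
# here says whether versions below 3.12 work with official fairseq.
FIRST_FAILING = (3, 12)


def fairseq_likely_broken(version_info=sys.version_info) -> bool:
    """True if the interpreter is at or above the version reported as failing."""
    return tuple(version_info[:2]) >= FIRST_FAILING


# Hypothetical usage before loading the model:
#   if fairseq_likely_broken():
#       raise RuntimeError("official fairseq reported incompatible with this Python")
```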

gukush commented 2 weeks ago

For now I have added extraction of semantic tokens. What is still needed are the coarse, fine and other models from Bark in order to generate the audio.
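
The remaining stages can be sketched as a simple chain: semantic tokens -> coarse tokens -> fine tokens -> waveform. The sketch below is not the code in this repository. It assumes the suno-ai/bark package is installed and that generate_coarse, generate_fine and codec_decode are importable from bark.generation. The bark-with-voice-clone fork may expose them differently, and the function name semantic_to_audio is hypothetical. The stages can be passed in explicitly, so the chain can be exercised without downloading any models.

```python
from typing import Any, Callable, Optional, Sequence

Stage = Callable[[Any], Any]


def semantic_to_audio(
    semantic_tokens: Any, stages: Optional[Sequence[Stage]] = None
) -> Any:
    """Run the stages that follow semantic token extraction, in order."""
    if stages is None:
        # Assumption: these functions exist in the installed bark package.
        # Imported lazily so the chain can be tested without bark.
        from bark.generation import codec_decode, generate_coarse, generate_fine

        stages = (generate_coarse, generate_fine, codec_decode)

    out = semantic_tokens
    for stage in stages:
        # Each stage consumes the previous stage's output.
        out = stage(out)
    return out
```

With stub stages the function can be checked in isolation. With the real Bark functions each stage is called with its default arguments only, so conditioning on a cloned voice (for example through a history prompt) would still have to be added on top.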