Closed: adriens closed this issue 1 year ago
I also want to know
I managed to do it using GPT as an assistant:
!sudo apt-get install openjdk-8-jdk
!echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
!curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
!sudo apt-get update && sudo apt-get install bazel
!sudo apt-get install python3.7 python3.7-dev python3.7-tk
!pip3 install virtualenv==16.7.8
!sudo apt-get install gcc-7 g++-7
!sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 60 --slave /usr/bin/g++ g++ /usr/bin/g++-7
!sudo update-alternatives --config gcc
!git clone https://github.com/suno-ai/bark.git
!apt-get install python
!apt-get install python-pip
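Note that cloning the repo alone doesn't install the package; Bark's README installs it straight from GitHub with pip, so you'd typically also run something like:

!pip install git+https://github.com/suno-ai/bark.git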
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
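If you're tight on VRAM, the Bark README also mentions two environment variables for smaller models and CPU offloading. They have to be set before importing bark; this is an optional sketch, not part of the original snippet:

# optional, per the Bark README: trade some quality for memory
os.environ["SUNO_USE_SMALL_MODELS"] = "True"   # load the small model variants
os.environ["SUNO_OFFLOAD_CPU"] = "True"        # offload idle models to CPU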
from IPython.display import Audio
import nltk  # we'll use this to split into sentences
import numpy as np

from bark.generation import (
    generate_text_semantic,
    preload_models,
)
from bark.api import semantic_to_waveform
from bark import generate_audio, SAMPLE_RATE
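preload_models is imported above but never called in the snippet; if you'd rather download and cache all the models up front instead of on first use, one call after the imports is enough:

preload_models()  # fetches and loads Bark's text, coarse and fine models (plus the codec)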
script = """
Here comes your script to be spoken.
""".replace("\n", " ").strip()
!pip install nltk

import nltk
nltk.download('punkt')

# split the script into sentences; the generation loop below iterates over these
sentences = nltk.sent_tokenize(script)
GEN_TEMP = 0.6
SPEAKER = "v2/pt_speaker_3"
silence = np.zeros(int(0.25 * SAMPLE_RATE))  # quarter second of silence
pieces = []
for sentence in sentences:
    semantic_tokens = generate_text_semantic(
        sentence,
        history_prompt=SPEAKER,
        temp=GEN_TEMP,
        min_eos_p=0.05,  # this controls how likely the generation is to end
    )
    audio_array = semantic_to_waveform(semantic_tokens, history_prompt=SPEAKER)
    pieces += [audio_array, silence.copy()]
from scipy.io.wavfile import write as write_wav

# listen to the full generation and save it as a wav file
full_audio = np.concatenate(pieces)
Audio(full_audio, rate=SAMPLE_RATE)
write_wav("bark_generation.wav", SAMPLE_RATE, full_audio)
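For short prompts you don't need the sentence-splitting loop at all; Bark's top-level generate_audio takes the text and one of the built-in speaker presets directly (the preset name below is just an example):

from bark import generate_audio, SAMPLE_RATE
from IPython.display import Audio

# one-shot generation with a built-in voice preset
audio = generate_audio("Hello, this is a quick test.", history_prompt="v2/en_speaker_6")
Audio(audio, rate=SAMPLE_RATE)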
if you mean just voice cloning with an existing model then we don't support that for now, but there are some forks where people have gone in that direction. as for training/finetuning there is a bunch of chatter happening on discord with people working on that
@danielklk, what I meant was to learn how to clone or create a new voice :smile_cat:
> if you mean just voice cloning with an existing model then we don't support that for now, but there are some forks where people have gone in that direction. as for training/finetuning there is a bunch of chatter happening on discord with people working on that
Yes @gkucsko, that's what I meant.
So if I understand correctly, we can't really clone a real-life voice, but rather modify and fine-tune the existing voices. I've read some chats on Discord but couldn't find a central place for guidelines. I did find some really nice-sounding voices made by the community :open_mouth:
Could you share them please?
I would like to learn how to create custom models (for example my own voice) but I'm lacking some documentation to achieve this. :pray: