Open Eddcapone opened 1 year ago
I've figured out that I have to navigate to the folder where my test.py script is located and then call python test.py. It behaves completely differently when called from another directory.
It installs some packages but then it just outputs:
_Loading the tokenizer from the special_tokens_map.json and the added_tokens.json will be removed in transformers 5, it is kept for forward compatibility, but it is recommended to update your tokenizer_config.json by uploading it again. You will see the new added_tokens_decoder attribute that will store the relevant information. The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:10000 for open-end generation._
Then I extended the script as instructed and installed IPython with pip install ipython. But when I run it, I still get no audio output.
from transformers import AutoProcessor, BarkModel
from IPython.display import Audio
processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")
voice_preset = "v2/en_speaker_6"
inputs = processor("Hello, my dog is cute", voice_preset=voice_preset)
audio_array = model.generate(**inputs)
audio_array = audio_array.cpu().numpy().squeeze()
sample_rate = model.generation_config.sample_rate
Audio(audio_array, rate=sample_rate)
I also tried this:
from transformers import AutoProcessor, BarkModel
from IPython.display import Audio
import scipy
processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")
voice_preset = "v2/en_speaker_6"
inputs = processor("Hello, my dog is cute", voice_preset=voice_preset)
audio_array = model.generate(**inputs)
audio_array = audio_array.cpu().numpy().squeeze()
#sample_rate = model.generation_config.sample_rate
#Audio(audio_array, rate=sample_rate)
sample_rate = model.generation_config.sample_rate
scipy.io.wavfile.write("bark_out.wav", rate=sample_rate, data=audio_array)
But no output file is generated.
Please fix the instructions
OK! So I figured it out. I had to move the test.py script into the "bark" folder, which they apparently refer to as the "main" folder.
Then this script works:
from transformers import AutoProcessor, BarkModel
import scipy
processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")
voice_preset = "v2/en_speaker_6"
inputs = processor("Hello, my dog is cute", voice_preset=voice_preset)
audio_array = model.generate(**inputs)
audio_array = audio_array.cpu().numpy().squeeze()
sample_rate = model.generation_config.sample_rate
scipy.io.wavfile.write("bark_out.wav", rate=sample_rate, data=audio_array)
It will take some time and there is no feedback, but after a while it will eventually generate the bark_out.wav in the same folder as the script.
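A note on where the file lands: "bark_out.wav" is a relative path, so it is written to the current working directory, which only matches the script's folder when the script is launched from there. Below is a minimal sketch for pinning the output next to the script regardless of where it is started. `output_path_beside` is a hypothetical helper name, not part of Bark or transformers.

```python
from pathlib import Path


def output_path_beside(script_file: str, filename: str = "bark_out.wav") -> Path:
    """Return an absolute path for `filename` in the folder that contains `script_file`."""
    return Path(script_file).resolve().parent / filename


# Usage inside test.py (assumption: the variables come from the script above):
#   out_path = output_path_beside(__file__)
#   print(f"Generating audio, this can take a while; writing to {out_path}")
#   scipy.io.wavfile.write(str(out_path), rate=sample_rate, data=audio_array)
```

Printing the target path before generation also gives at least some feedback while the model runs.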
These are the worst usage instructions I have ever come across. smh...
Did you manually download the models? If so, which folder did you add them into?
@CodeRippleDatabase What models? I just did everything like described above.
This is a bit confusing, but those are two separate Bark implementations. You can use either:
from transformers import AutoProcessor, BarkModel
git clone https://github.com/suno-ai/bark
However, to make things even more confusing, Suno's Bark requires Hugging Face transformers, which means you basically install both versions. But if you aren't using the Suno code, you can skip the Suno part and just use the Hugging Face implementation.
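To see which of the two code paths is actually importable in a given environment, a quick check along these lines may help. It is only a sketch: `installed_bark_backends` is a hypothetical helper, and it assumes Suno's package imports under the name `bark`, as in the cloned repo.

```python
import importlib.util


def installed_bark_backends() -> dict:
    """Report which of the two Bark code paths can be located by the import system."""
    return {
        # Hugging Face port: `from transformers import AutoProcessor, BarkModel`
        "transformers": importlib.util.find_spec("transformers") is not None,
        # Suno's original package. Caveat: a plain folder named `bark` in the
        # current directory can also match, as a namespace package.
        "bark": importlib.util.find_spec("bark") is not None,
    }


print(installed_bark_backends())
```

A `True` for transformers only says the package is present. It does not guarantee the installed version is recent enough to include `BarkModel`.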
Good job!
First I installed bark:
Then inside of the bark folder I installed transformers:
pip install git+https://github.com/huggingface/transformers.git
Then I created this python script named test.py from inside the main folder (bark) as instructed:
Then I ran it, but a console window opened for a split second and nothing happened.
This is what I get when I run the script with python:
$ python /a/AI/Text-To-Speech/suno/bark/test.py
Traceback (most recent call last):
File "A:\AI\Text-To-Speech\suno\bark\test.py", line 3, in <module>
processor = AutoProcessor.from_pretrained("suno/bark")
File "C:\Users\Edd\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\processing_auto.py", line 258, in from_pretrained
config = AutoConfig.from_pretrained(
File "C:\Users\Edd\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\configuration_auto.py", line 1032, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "C:\Users\Edd\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\configuration_utils.py", line 620, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "C:\Users\Edd\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\configuration_utils.py", line 675, in _get_config_dict
resolved_config_file = cached_file(
File "C:\Users\Edd\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\hub.py", line 400, in cached_file
raise EnvironmentError(
OSError: suno/bark does not appear to have a file named config.json. Checkout 'https://huggingface.co/suno/bark/None' for available files.
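One plausible reading of this traceback, not confirmed anywhere in the thread: `from_pretrained` treats its argument as a local path when a directory with that name exists relative to the current working directory. The script lives under ...\suno\bark\, so launching it from the folder that contains `suno` would make "suno/bark" resolve to the cloned repo, which has no config.json, instead of the Hub model. The sketch below checks for that situation. `find_shadowing_dir` is a hypothetical helper, not a transformers function.

```python
from pathlib import Path
from typing import Optional


def find_shadowing_dir(repo_id: str, cwd: Optional[str] = None) -> Optional[Path]:
    """Return the local directory that `repo_id` resolves to relative to `cwd`, if one exists."""
    base = Path(cwd) if cwd is not None else Path.cwd()
    candidate = base / repo_id
    return candidate if candidate.is_dir() else None


hit = find_shadowing_dir("suno/bark")
if hit is not None:
    print(f"Local folder {hit} may be picked up instead of the Hub repo 'suno/bark'.")
else:
    print("No local 'suno/bark' folder relative to the current directory.")
```

If the check reports a local folder, running the script from a different directory, as was eventually done here by running it from inside the bark folder, would avoid the clash.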