Bark - Expected all tensors to be on the same device, but found cpu and cuda

Describe the bug Both bark and bark-small get the tensors are on cpu and cuda error, I don't have experience in bark programs, but I did try commenting out the enable_cpu_offload() in bark.py line 75 but same error remains.

Console log

[Bark 🗣️ ]: starting module The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:10000 for open-end generation. Traceback (most recent call last): File "C:\Users\XXX\biniou\env\Lib\site-packages\gradio\queueing.py", line 407, in call_prediction output = await route_utils.call_process_api( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\gradio\route_utils.py", line 226, in call_process_api output = await app.get_blocks().process_api( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\gradio\blocks.py", line 1550, in process_api result = await self.call_function( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\gradio\blocks.py", line 1185, in call_function prediction = await anyio.to_thread.run_sync( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync return await get_async_backend().run_sync_in_worker_thread( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\anyio_backends_asyncio.py", line 2177, in run_sync_in_worker_thread return await future ^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\anyio_backends_asyncio.py", line 859, in run result = context.run(func, args) ^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\gradio\utils.py", line 661, in wrapper response = f(args, kwargs) ^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\gradio\utils.py", line 661, in wrapper response = f(*args, *kwargs) ^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\ressources\common.py", line 573, in wrap_func result = func(args, kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\ressources\bark.py", line 80, in music_bark audio_array = pipe_bark.generate(inputs, do_sample=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context return func(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\transformers\models\bark\modeling_bark.py", line 1712, in generate semantic_output = self.semantic.generate( ^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\transformers\models\bark\modeling_bark.py", line 899, in generate semantic_output = super().generate( ^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context return func(args, kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\transformers\generation\utils.py", line 1989, in generate result = self._sample( ^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\transformers\generation\utils.py", line 2942, in _sample next_token_scores = logits_processor(input_ids, next_token_logits) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\transformers\generation\logits_process.py", line 98, in call scores = processor(input_ids, scores) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\XXX\biniou\env\Lib\site-packages\transformers\generation\logits_process.py", line 1836, in call suppress_token_mask = torch.isin(vocab_tensor, self.suppress_tokens) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument test_elements in method wrapper_CUDA_isin_Tensor_Tensor)

Hardware (please complete the following information):

RAM size: [e.g. 16GB] 32GB
Processor physical core number or vcore number : [e.g. 4] 6
Storage type of biniou installation : [e.g. HDD, SSD, NVMe]2T M.2 SSD
Free storage space for biniou installation : [e.g. 50GB] 1.5T
GPU : [e.g. Nvidia Geforce RTX 3090, AMD Radeon RX 7800 XT] Nvidia GTX 4070Ti 12GB

Additional informations

I access the webui through the ip 127.0.0.1 (browser and biniou installation on the same system) : [] Yes [ X] No

Hello @vincent0408,

Thanks for reporting this bug and your interest in the project.

I'll try to fix it asap.

@vincent0408,

Commit af11510 may fix this issue : as I don't have access to CUDA hardware, I can't validate that it works, but I've tested on CPU and it's worked fine.

Can you confirm that it solve the issue on your side ?

Sorry for the late reply, I can confirm that moving the tensors in the commit does solve the tensor device problems. The next problem occurs at line 84 when saving the wav file, as scipy.io.wavfile.write seems to dislike float16, so I changed the line to write_wav(savename, sample_rate, audio_array.astype(np.float32)), and I get a working bark output. Don't know if float32 would be your choice as well, I just simply chose a dtype that would work.

Also, I really enjoyed your project and wanted to share some experience when installing as a Windows user. The error that I faced was the version conflict between pytorch and xformers. After a clean setup, xformers will complain WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for: PyTorch 2.1.0+cu121 with CUDA 1202 (you have 2.1.0+cpu) Python 3.10.12 (you have 3.10.12) Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers) Somehow during the installation, the cpu version of pytorch is installed, so one simply needs to navigate under the virtual environment and attempt to install pytorch with cuda121 support, alongside torchvision and others, I used this pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121 If the user already messed up xformers following the error message above, they should pip install xformers==0.0.22.post7 as one of the packages require xformers to be under 0.0.23, but 0.0.22 and below fails to meet other requirements. Hope this helps other user in the future.

Hello @vincent0408,

The next problem occurs at line 84 when saving the wav file, as scipy.io.wavfile.write seems to dislike float16, so I changed the line to write_wav(savename, sample_rate, audio_array.astype(np.float32)), and I get a working bark output.

Thanks for the feedback. I'll commit a bugfix asap.

WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for: PyTorch 2.1.0+cu121 with CUDA 1202 (you have 2.1.0+cpu) Python 3.10.12 (you have 3.10.12) Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)

See this comment for explanations on this message : it's misleading, and has nothing to do with the real error you are encountering : biniou is not compatible with Python 3.12, as PyTorch don't provide a wheel compiled for python 3.12 for version 2.1.0.

The windows installer is pretty "quick'n'dirty" and should not be used on an environment that already have different versions of the prerequisites needed by biniou.

Also, the previously mentioned comment explains how to activate CUDA support for image generation (you can also activate it for llama, but there's few chances that it will work), without using the command line :

Connect to the Global settings panel in the "Global settings" -> "Global settings login" tab, using default credentials biniou/biniou or customs one defined in the first line of .ini/auth.cfg .
Go to the "WebUI control" tab
In "Optimization type", select CUDA and click "Update biniou"
In "Llama-cpp-python backend", select CUDA -or any other backend available and ALREADY WORKING on your computer-, and click "update llama-cpp-python backend".
Relaunch biniou / restart webui and check if that does some magic ...

Thanks again for your feedbacks !

Woolverine94 / biniou

Bark - Expected all tensors to be on the same device, but found cpu and cuda #37