Closed vincent0408 closed 2 weeks ago
Hello @vincent0408,
Thanks for reporting this bug and your interest in the project.
I'll try to fix it asap.
@vincent0408,
Commit af11510 may fix this issue. As I don't have access to CUDA hardware, I can't validate that it works on GPU, but I tested it on CPU and it worked fine.
Can you confirm that it solves the issue on your side?
Sorry for the late reply, I can confirm that moving the tensors in the commit does solve the tensor device problems. The next problem occurs at line 84 when saving the wav file, as scipy.io.wavfile.write seems to dislike float16, so I changed the line to
write_wav(savename, sample_rate, audio_array.astype(np.float32))
and I get a working bark output. I don't know whether float32 would be your choice as well; I simply picked a dtype that works.
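For anyone hitting the same thing, here is a minimal sketch of that workaround, not the project's actual code. The helper name save_wav and the 24000 Hz sample rate are assumptions made for illustration; the only part taken from the report above is the cast to float32 before calling scipy.io.wavfile.write.

```python
import io

import numpy as np
from scipy.io.wavfile import write as write_wav


def save_wav(target, sample_rate, audio_array):
    """Write audio as a WAV file, upcasting half-precision samples first.

    `target` can be a path or an open binary file-like object.
    Returns the dtype that was actually written.
    """
    audio = np.asarray(audio_array)
    if audio.dtype == np.float16:
        # float16 output was reported as problematic, so store float32 instead.
        audio = audio.astype(np.float32)
    write_wav(target, sample_rate, audio)
    return audio.dtype


# Hypothetical input: one second of a quiet 440 Hz tone in float16.
sample_rate = 24000  # assumed value, use whatever rate the model reports
t = np.arange(sample_rate) / sample_rate
tone = (0.1 * np.sin(2 * np.pi * 440 * t)).astype(np.float16)

buffer = io.BytesIO()  # in-memory target so the sketch leaves no file behind
written_dtype = save_wav(buffer, sample_rate, tone)
```

Casting up from float16 to float32 is lossless, so the audio itself is unchanged; only the container format differs.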
Also, I really enjoyed your project and want to share my experience installing it as a Windows user. The error I faced was a version conflict between pytorch and xformers. After a clean setup, xformers complains:
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for: PyTorch 2.1.0+cu121 with CUDA 1202 (you have 2.1.0+cpu) Python 3.10.12 (you have 3.10.12) Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Somehow the CPU version of pytorch gets installed during the installation, so one simply needs to go into the virtual environment and install pytorch with CUDA 12.1 support, alongside torchvision and the others. I used this:
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
If the user already messed up xformers by following the error message above, they should run
pip install xformers==0.0.22.post7
as one of the packages requires xformers to be below 0.0.23, while 0.0.22 and below fail to meet other requirements. Hope this helps other users in the future.
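To check which PyTorch build actually ended up in the virtual environment, a small script like the one below can help. It is only a sketch: the helper names are made up for illustration, and it relies on the version string carrying a local build tag such as "+cpu" or "+cu121", as seen in the warning above.

```python
import re


def torch_build_tag(version):
    """Return the local build tag of a PyTorch version string ('' if none)."""
    _, sep, tag = version.partition("+")
    return tag if sep else ""


def is_cuda_build(version):
    """True when the build tag looks like a CUDA tag, e.g. 'cu121'."""
    return re.fullmatch(r"cu\d+", torch_build_tag(version)) is not None


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("torch version :", torch.__version__)
        print("CUDA build    :", is_cuda_build(torch.__version__))
        print("CUDA available:", torch.cuda.is_available())
```

Run it with the virtual environment's Python interpreter. If it reports a "+cpu" build, the reinstall command above is the next step.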
Hello @vincent0408,
> The next problem occurs at line 84 when saving the wav file, as scipy.io.wavfile.write seems to dislike float16, so I changed the line to
> write_wav(savename, sample_rate, audio_array.astype(np.float32))
> , and I get a working bark output.
Thanks for the feedback. I'll commit a bugfix asap.
> WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for: PyTorch 2.1.0+cu121 with CUDA 1202 (you have 2.1.0+cpu) Python 3.10.12 (you have 3.10.12) Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
See this comment for an explanation of this message: it is misleading and has nothing to do with the real error you are encountering. biniou is not compatible with Python 3.12, as PyTorch doesn't provide a wheel compiled for Python 3.12 for version 2.1.0.
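A quick way to rule out the interpreter version is a check like the following. This is an illustrative sketch only: the function name is hypothetical, and the 3.12 cutoff is simply taken from the statement above.

```python
import sys

# Assumption taken from the reply above: Python 3.12 is the first
# version for which no matching PyTorch 2.1.0 wheel is available.
FIRST_UNSUPPORTED = (3, 12)


def python_is_supported(version_info=None):
    """Return True if the (major, minor) version is below the cutoff."""
    major, minor = tuple(version_info or sys.version_info)[:2]
    return (major, minor) < FIRST_UNSUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_is_supported() else "not supported"
    print(f"Python {current} is {status}")
```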
The Windows installer is pretty "quick'n'dirty" and should not be used in an environment that already has different versions of the prerequisites needed by biniou.
Also, the previously mentioned comment explains how to activate CUDA support for image generation without using the command line (you can also activate it for llama, but there is little chance that it will work).
Thanks again for your feedback!
Describe the bug
Both bark and bark-small fail with a "tensors are on cpu and cuda" error. I don't have experience with bark programs, but I did try commenting out the enable_cpu_offload() call in bark.py, line 75, and the same error remains.
Console log
Hardware (please complete the following information):
Additional informations