Open nedo99 opened 5 months ago
ValueError: Unrecognized configuration class
This error is not really about BigDL's support. The message indicates that AutoModelForCausalLM
does not support loading the SpeechT5 model. Possible causes: you are using the wrong AutoClass, your transformers version is out of date, or transformers does not support loading this model through AutoClasses.
As for bigdl-llm usage: 1) the auto classes in bigdl.llm.transformers and 2) optimize_model are two separate sets of APIs. Choose one of them (not both) to optimize a model.
Use the auto classes from bigdl.llm.transformers if the model supports auto class loading (taking AutoModelForCausalLM as an example):
from bigdl.llm.transformers import AutoModelForCausalLM
# specify load_in_4bit=True to load the model directly in 4-bit
bigdl_model = AutoModelForCausalLM.from_pretrained('/path/to/model/', load_in_4bit=True)
Use optimize_model for loading an arbitrary PyTorch model:
# the normal way of loading a model with a model loader class XXXModel, e.g. `SpeechT5Model`
from transformers import XXXModel
original_model = XXXModel.from_pretrained(...)
from bigdl.llm import optimize_model
bigdl_model = optimize_model(original_model)
According to the transformers SpeechT5 doc, you can use SpeechT5Model to load the model, so you may try loading it that way and then applying optimize_model, as follows.
from transformers import SpeechT5Model
...
model = SpeechT5Model.from_pretrained("microsoft/speecht5_tts")
from bigdl.llm import optimize_model
bigdl_model = optimize_model(model)
bigdl_model.to('xpu')
The SpeechT5 model can be loaded successfully with bigdl-llm using AutoModelForSpeechSeq2Seq instead of AutoModelForCausalLM. The code below works in our test (using transformers version 4.31.0 and bigdl version 2.5.0b20240124).
from bigdl.llm.transformers import AutoModelForSpeechSeq2Seq
...
model = AutoModelForSpeechSeq2Seq.from_pretrained("microsoft/speecht5_tts", load_in_4bit=True)
model = model.to("xpu")
...
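The failing and working calls differ only in which loader class is asked to handle the checkpoint. A minimal sketch of trying several loader classes in order is shown below. `load_with_fallback` is a hypothetical helper, not part of bigdl-llm or transformers; it only assumes that each candidate exposes `from_pretrained` and raises `ValueError` for a configuration it does not recognize, as in the error reported in this issue.

```python
def load_with_fallback(model_id, loaders, **kwargs):
    """Try each loader class in order; return (loader_name, model) for the first success.

    Hypothetical helper: assumes every loader has a `from_pretrained` classmethod
    and raises ValueError when it does not recognize the model configuration.
    """
    failures = []
    for loader in loaders:
        try:
            return loader.__name__, loader.from_pretrained(model_id, **kwargs)
        except ValueError as exc:
            # Remember why this loader was rejected, then try the next one.
            failures.append(f"{loader.__name__}: {exc}")
    raise ValueError("no loader accepted the model:\n" + "\n".join(failures))
```

With bigdl-llm installed, the call might look like `load_with_fallback("microsoft/speecht5_tts", [AutoModelForCausalLM, AutoModelForSpeechSeq2Seq], load_in_4bit=True)`, with both classes imported from `bigdl.llm.transformers`.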
I updated my bigdl version, but now I am getting a segfault. Here is the backtrace:
in xpu::dpcpp::initGlobalDevicePoolState() () from /envs/llm/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so
(gdb) bt
#0 0x00007fff17d3a9cb in xpu::dpcpp::initGlobalDevicePoolState() ()
from envs/llm/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so
#1 0x00007ffff7c99ee8 in __pthread_once_slow (once_control=0x7fff2a0888e0 <xpu::dpcpp::init_device_flag>,
init_routine=0x7fffc24dac90 <std::__once_proxy()>) at ./nptl/pthread_once.c:116
#2 0x00007fff17d37291 in xpu::dpcpp::dpcppGetDeviceCount(int*) ()
from envs/llm/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so
#3 0x00007fff17cf0912 in xpu::dpcpp::device_count()::{lambda()#1}::operator()() const ()
from envs/llm/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so
#4 0x00007fff17cf08d8 in xpu::dpcpp::device_count() ()
According to the backtrace, it seems to be an issue with finding the GPU. sycl-ls shows a discrete GPU. I am using oneAPI 2023.2 and kernel 5.19-0.41, and this setup works for other models. Could the issue be the oneAPI version?
Yes, the default BigDL-LLM has upgraded to PyTorch 2.1/oneAPI 2024.0, and you will need to upgrade your oneAPI.
Alternatively, you may keep using the PyTorch 2.0 version of BigDL-LLM, which is compatible with oneAPI 2023.2 (see https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux).
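The pairing described above can be written down as a small lookup, so that a mismatch is caught before anything touches the GPU. This is only a sketch: `EXPECTED_ONEAPI` and `expected_oneapi` are hypothetical names, and the table contains only the two pairings mentioned in this thread.

```python
# Pairings stated in this thread; any other version is treated as unknown.
EXPECTED_ONEAPI = {
    "2.1": "2024.0",  # default BigDL-LLM build
    "2.0": "2023.2",  # PyTorch 2.0 build of BigDL-LLM
}


def expected_oneapi(torch_version):
    """Return the oneAPI release this thread pairs with a PyTorch version, or None."""
    # Drop any local suffix, then keep major.minor: "2.1.0a0+cxx11.abi" -> "2.1"
    major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    return EXPECTED_ONEAPI.get(major_minor)
```

In a real environment the argument would typically be `torch.__version__`.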
But with oneAPI 2023.2 this does not work; it segfaults as mentioned in my previous comment. With oneAPI 2024 I still could not get anything working, since I get this error message: ImportError: libsycl.so.6: cannot open shared object file: No such file or directory. I tried both oneAPI 2024.0.0.49564 and 2024.0.2-49895 with PyTorch 2.1.
Did you correctly configure the oneAPI env variables (refer to the instructions here)? Also pay attention to the runtime configuration instructions here, which may prevent lots of runtime issues.
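One quick way to see whether the environment variables took effect is to check if the library named in the ImportError is present in any directory on `LD_LIBRARY_PATH`. The helper below is a hypothetical sketch using only the standard library. The dynamic loader also consults other locations (for example the ldconfig cache), so an empty result is a hint rather than proof.

```python
import os
from pathlib import Path


def find_shared_lib(name, search_path=None):
    """Return the directories on an os.pathsep-separated search path that contain `name`."""
    if search_path is None:
        # Default to the loader search path of the current process.
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing or doubled separators.
        if entry and (Path(entry) / name).exists():
            hits.append(entry)
    return hits
```

For the error in this thread, `find_shared_lib("libsycl.so.6")` returning an empty list would suggest that no directory on the current path provides that exact file name.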
Yes and yes, but the issue is still there.
Could you provide the os, kernel and python version?
To resolve the "ImportError: libsycl.so.6: cannot open shared object file" problem and use oneAPI 2024.0, we recommend creating a new conda env:
conda create -n new-llm-env python=3.9
conda activate new-llm-env
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
Or if you would like to use BigDL-LLM with oneAPI 2024.0 in your old conda environment, you could:
pip uninstall bigdl-core-xe
pip uninstall bigdl-core-xe-21
pip uninstall bigdl-core-xe-esimd
pip uninstall bigdl-core-xe-esimd-21
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
Note that bigdl-llm, bigdl-core-xe-21 and bigdl-core-xe-esimd-21 should have the same version if bigdl-llm has been upgraded correctly to the one with oneAPI 2024.0/PyTorch 2.1.
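The version rule in the note above can be checked programmatically. The sketch below uses the standard-library `importlib.metadata`; the function names `installed_versions` and `versions_consistent` are hypothetical and not part of bigdl-llm.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def versions_consistent(versions):
    """True when every package is installed and all of them report the same version."""
    values = set(versions.values())
    return None not in values and len(values) == 1
```

A check for this thread could then be `versions_consistent(installed_versions(["bigdl-llm", "bigdl-core-xe-21", "bigdl-core-xe-esimd-21"]))`.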
Could you provide the os, kernel and python version?
OS: Ubuntu 22.04
kernel: 5.19.0-41-generic
Python: 3.9.18
oneAPI: 2024.0.0.49564
bigdl-llm: 2.5.0b20240201
bigdl_core_xe: 2.5.0b20240201
bigdl_core_xe_esimd: 2.5.0b20240201
intel_extension_for_pytorch: 2.1.10+xpu
Hi @nedo99,
For bigdl-llm>=2.5.0b20240204, you could run SpeechT5 with BigDL-LLM optimization as below :)
Env (PyTorch 2.1 with oneAPI 2024.0):
conda create -n speecht5-test python=3.9
conda activate speecht5-test
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install datasets soundfile
Runtime Configuration: following here
Code:
import torch
from transformers import SpeechT5Processor, SpeechT5HifiGan, SpeechT5ForTextToSpeech
from datasets import load_dataset
import soundfile as sf
import time
processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")
from bigdl.llm import optimize_model
model = optimize_model(model, modules_to_not_convert=["speech_decoder_postnet.feat_out",
                                                      "speech_decoder_postnet.prob_out"])
model = model.to('xpu')
vocoder = vocoder.to('xpu')
text = "On a cold winter night, a lonely traveler found a shimmering stone in the snow, unaware that it would lead him to a world full of wonders."
inputs = processor(text=text, return_tensors="pt").to('xpu')
# load xvector containing speaker's voice characteristics from a dataset
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors",
                                  split="validation")
speaker_embeddings = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0).to('xpu')
with torch.inference_mode():
    # warmup
    st = time.perf_counter()
    speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
    print(f'Warmup time: {time.perf_counter() - st}')
    st1 = time.perf_counter()
    speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
    # wait for queued XPU work to finish before stopping the clock
    torch.xpu.synchronize()
    st2 = time.perf_counter()
    print(f"Inference time: {st2-st1}")
sf.write("speech_bigdl_llm.wav", speech.to('cpu').numpy(), samplerate=16000)
Please let us know if you run into any further problems :)
If you are also interested in other TTS models we support, you can run Bark with BigDL-LLM optimization as follows :)
Env (PyTorch 2.1 with oneAPI 2024.0):
conda create -n bark-test python=3.9
conda activate bark-test
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install scipy
Runtime Configuration: following here
Code:
from transformers import AutoProcessor, BarkModel
import torch
import time
processor = AutoProcessor.from_pretrained("suno/bark-small")
model = BarkModel.from_pretrained("suno/bark-small")
from bigdl.llm import optimize_model
model = optimize_model(model).to('xpu')
voice_preset = "v2/en_speaker_6"
text = "Hello, my name is Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as playing tic tac toe."
inputs = processor(text, voice_preset=voice_preset).to('xpu')
# warmup
st = time.time()
with torch.inference_mode():
    model.generate(**inputs)
    torch.xpu.synchronize()
print(f"Warmup time: {time.time() - st}")
st = time.time()
with torch.inference_mode():
    audio_array = model.generate(**inputs)
    torch.xpu.synchronize()
print(f"Inference time: {time.time() - st}")
audio_array = audio_array.cpu().numpy().squeeze()
from scipy.io.wavfile import write as write_wav
sample_rate = model.generation_config.sample_rate
write_wav("output/bark_generation_bigdl_llm.wav", sample_rate, audio_array)
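Both samples above follow the same pattern: run once as warmup, then time a second run, synchronizing the device before reading the clock so that queued work is included in the measurement. A minimal, device-agnostic sketch of that pattern follows. `timed` is a hypothetical helper, and `sync` stands for whatever synchronization call the backend provides (`torch.xpu.synchronize` in the samples).

```python
import time


def timed(fn, sync=None, warmup=1):
    """Run `fn` `warmup` times untimed, then once timed; return (result, seconds)."""
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()  # make sure warmup work has finished before starting the clock
    start = time.perf_counter()
    result = fn()
    if sync is not None:
        sync()  # wait for queued device work before stopping the clock
    return result, time.perf_counter() - start
```

In the Bark sample this could be used as `audio_array, seconds = timed(lambda: model.generate(**inputs), sync=torch.xpu.synchronize)` inside the `torch.inference_mode()` block.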
Speech T5 sample works.
Bark does not work. It segfaults and has the same backtrace as posted in one of the previous comments.
Hi @nedo99,
Could you let me know your test env for Bark?
Could you provide the os, kernel and python version?
OS: Ubuntu 22.04 kernel: 5.19.0-41-generic Python 3.9.18 oneAPI: 2024.0.0.49564 bigdl-llm: 2.5.0b20240201 bigdl_core_xe: 2.5.0b20240201 bigdl_core_xe_esimd: 2.5.0b20240201 intel_extension_for_pytorch: 2.1.10+xpu
The environment shown here does not look like a correct PyTorch 2.1 env to me :) You could try the steps here to set up a correct PyTorch 2.1 + oneAPI 2024.0 env for bigdl-llm: https://github.com/intel-analytics/BigDL/issues/10025#issuecomment-1920570410
Here is the updated environment:
pip list | grep bigdl
bigdl-core-xe-21 2.5.0b20240206
bigdl-core-xe-esimd-21 2.5.0b20240206
bigdl-llm 2.5.0b20240206
Name: intel-extension-for-pytorch
Version: 2.1.10+xpu
oneAPI 2024
Hello,
I am trying to run Speech T5 on XPU but am unable to. It is this model https://huggingface.co/microsoft/speecht5_tts and here is my code:
and I am getting the following error:
Is there support for text-to-speech by BigDL? Or am I missing something?
Regards, Nedim