intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc
Apache License 2.0

IPEX-LLM on Intel Max Series 1100 for inference libintel-ext-pt-gpu.so: undefined symbol: _ZNK5torch8autograd4Node4nameB5cxx11Ev #10941

Open shailesh837 opened 6 months ago

shailesh837 commented 6 months ago

I am trying to synthesize speech with Coqui TTS:

https://docs.coqui.ai/en/latest/

(llm) spandey2@imu-nex-sprx92-max1-sut:~/1worldsync_finetuning$ cat tts.py

import torch
from TTS.api import TTS
from ipex_llm import optimize_model

# Get device
device = 'xpu'  # Use 'xpu' to indicate Intel GPU

# List available TTS models
print(TTS().list_models())

# Init TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

tts = optimize_model(tts, cpu_embedding=True)
tts = tts.to(device)

wav = tts.tts(text="Hello world!", speaker_wav="./audio.wav", language="en").to('cpu')  # Move output to CPU if needed
tts.tts_to_file(text="Hello world!", speaker_wav="./audio.wav", language="en", file_path="output.wav")

ENV Set:
source /opt/intel/oneapi/setvars.sh
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1

Error :

Dependency issue as well: 
pip install TTS  # from PyPI
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tts 0.22.0 requires numpy==1.22.0; python_version <= "3.10", but you have numpy 1.26.4 which is incompatible.
tts 0.22.0 requires torch>=2.1, but you have torch 2.1.0a0+cxx11.abi which is incompatible.
tts 0.22.0 requires transformers>=4.33.0, but you have transformers 4.31.0 which is incompatible.
torchaudio 2.3.0 requires torch==2.3.0, but you have torch 2.1.0a0+cxx11.abi which is incompatible.
Successfully installed intel-extension-for-pytorch-2.1.10+xpu numpy-1.26.4 tokenizers-0.13.3 torch-2.1.0a0+cxx11.abi torchvision-0.16.0a0+cxx11.abi transformers-4.31.0

(llm) spandey2@imu-nex-sprx92-max1-sut:~/1worldsync_finetuning$ python tts.py
ERROR: ld.so: object '/home/spandey2/miniconda3/envs/llm/lib/libtcmalloc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
/home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
Traceback (most recent call last):
  File "/home/spandey2/1worldsync_finetuning/tts.py", line 3, in <module>
    from ipex_llm import optimize_model
  File "/home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm/__init__.py", line 34, in <module>
    ipex_importer.import_ipex()
  File "/home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm/utils/ipex_importer.py", line 59, in import_ipex
    import intel_extension_for_pytorch as ipex
  File "/home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/intel_extension_for_pytorch/__init__.py", line 94, in <module>
    from .utils._proxy_module import *
  File "/home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/intel_extension_for_pytorch/utils/_proxy_module.py", line 2, in <module>
    import intel_extension_for_pytorch._C
ImportError: /home/spandey2/miniconda3/envs/llm/lib/python3.11/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so: undefined symbol: _ZNK5torch8autograd4Node4nameB5cxx11Ev
qiyuangong commented 6 months ago

This seems related to this issue: https://github.com/intel/intel-extension-for-pytorch/issues/317

@gc-fu Can we solve this issue in conda env?

gc-fu commented 6 months ago

The issue https://github.com/intel/intel-extension-for-pytorch/issues/317 was caused by combining a self-built torch with a pre-built intel-extension-for-pytorch.

I tried installing these two dependencies in conda and then importing intel_extension_for_pytorch as ipex. Everything works fine.

Can you try installing them through conda?
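To see what the missing symbol refers to, the mangled name can be decoded. The sketch below is a minimal, hand-written decoder for the simple nested-name subset of the Itanium C++ name mangling (length-prefixed components plus `B` ABI tags). The function name `demangle_nested_name` is hypothetical, and the parameter list and `const` qualifier are ignored; a tool such as `c++filt` gives the full signature.

```python
def demangle_nested_name(symbol: str) -> str:
    """Decode a simple Itanium-mangled nested name such as _ZNK3foo3barEv.

    Only length-prefixed name components and ABI tags (B<len><tag>) are
    handled; anything else raises ValueError.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("not a mangled nested name")
    pos = 3
    # Skip CV-qualifiers on the member function (r, V, K).
    while pos < len(symbol) and symbol[pos] in "rVK":
        pos += 1

    def read_chunk(start: int):
        """Read <decimal length><that many characters> starting at `start`."""
        end = start
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        if end == start:
            raise ValueError(f"unsupported mangling at offset {start}")
        length = int(symbol[start:end])
        if end + length > len(symbol):
            raise ValueError("truncated symbol")
        return symbol[end:end + length], end + length

    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        if symbol[pos] == "B":
            # An ABI tag decorates the preceding name component.
            if not parts:
                raise ValueError("ABI tag without a preceding name")
            tag, pos = read_chunk(pos + 1)
            parts[-1] += f"[abi:{tag}]"
        else:
            name, pos = read_chunk(pos)
            parts.append(name)
    if pos >= len(symbol):
        raise ValueError("missing terminating 'E'")
    return "::".join(parts)


if __name__ == "__main__":
    print(demangle_nested_name("_ZNK5torch8autograd4Node4nameB5cxx11Ev"))
```

The `[abi:cxx11]` tag in the decoded name shows that libintel-ext-pt-gpu.so expects a torch build that exports the cxx11-ABI variant of this function, which fits the mismatched-build explanation above.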

qiyuangong commented 6 months ago

Hi @shailesh837

Can you share your OS and kernel version? This error is raised by intel-extension-for-pytorch. In most cases, it is caused by an out-of-date OS or glibc.
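One way to collect these details in a single step is a short standard-library script. This is only a sketch: the helper name `describe_host` is hypothetical, `platform.libc_ver()` returns empty strings on non-glibc systems, and `platform.freedesktop_os_release()` requires Python 3.10 or newer.

```python
import platform


def describe_host() -> dict:
    """Collect OS, kernel, libc and Python details for a bug report."""
    libc_name, libc_version = platform.libc_ver()
    try:
        # Reads /etc/os-release; unavailable before Python 3.10 or off Linux.
        distro = platform.freedesktop_os_release().get("PRETTY_NAME", "")
    except (AttributeError, OSError):
        distro = ""
    return {
        "system": platform.system(),
        "distro": distro,
        "kernel": platform.release(),
        "machine": platform.machine(),
        "libc": f"{libc_name} {libc_version}".strip(),
        "python": platform.python_version(),
    }


if __name__ == "__main__":
    for key, value in describe_host().items():
        print(f"{key}: {value}")
```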

shailesh837 commented 6 months ago

I have Ubuntu 22.04 LTS, kernel: Linux imu-nex-nuc13x2-arc770-dut 6.5.0-26-generic.

@gc-fu: Did you manage to run this code?

import torch
from TTS.api import TTS
from ipex_llm import optimize_model

# Get device
device = 'xpu'  # Use 'xpu' to indicate Intel GPU

# List available TTS models
print(TTS().list_models())

# Init TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

tts = optimize_model(tts, cpu_embedding=True)
tts = tts.to(device)

wav = tts.tts(text="Hello world!", speaker_wav="./audio.wav", language="en").to('cpu')  # Move output to CPU if needed
tts.tts_to_file(text="Hello world!", speaker_wav="./audio.wav", language="en", file_path="output.wav")

gc-fu commented 6 months ago

After some investigation, the issue was caused by not installing the correct torchaudio version. Install using these instructions:

pip install TTS
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
# Then install the matching torchaudio build
pip install torchaudio==2.1.0a0 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

The code results: (image)

The undefined symbol issue should now be resolved.

The error shown here is most likely caused by ipex-llm not supporting TTS models for now.
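A quick way to spot this kind of mismatch before importing anything is to compare the installed package versions. The sketch below assumes that torch, torchaudio and intel-extension-for-pytorch are expected to share the same major.minor release series; that rule is a heuristic chosen for illustration, not an official compatibility check, and the helper names are hypothetical.

```python
import re
from importlib import metadata

# Packages assumed (for this sketch) to track the same major.minor series.
PACKAGES = ("torch", "torchaudio", "intel-extension-for-pytorch")


def release_series(version: str) -> str:
    """Return 'major.minor' from a version such as '2.1.0a0+cxx11.abi'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def find_mismatches(versions: dict, reference: str = "torch") -> list:
    """List packages whose release series differs from the reference package."""
    expected = release_series(versions[reference])
    return [name for name, version in versions.items()
            if release_series(version) != expected]


def installed_versions(packages=PACKAGES) -> dict:
    """Look up installed versions from package metadata, skipping missing ones."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    versions = installed_versions()
    print(versions)
    if "torch" in versions:
        print("mismatched:", find_mismatches(versions))
```

With the versions from the pip output in the first post (torch 2.1.0a0+cxx11.abi, torchaudio 2.3.0, intel-extension-for-pytorch 2.1.10+xpu), `find_mismatches` flags torchaudio.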

shailesh837 commented 6 months ago

I managed to run the TTS code below on XPU, but it takes 23 seconds, whereas it takes only 3 seconds on CPU. Can we please check what the issue is on XPU?

(tts) spandey2@imu-nex-nuc13x2-arc770-dut:~/tts$ cat andrej_code_tts.py

# In case of proxies, remove .intel.com from no proxies:
# import os
# os.environ['no_proxy'] = '10.0.0.0/8,192.168.0.0/16,localhost,127.0.0.0/8,134.134.0.0/16'
# os.environ['NO_PROXY'] = '10.0.0.0/8,192.168.0.0/16,localhost,127.0.0.0/8,134.134.0.0/16'

# IPEX
# import subprocess
# subprocess.run(["python", "-m", "pip", "install", "torch==2.1.0.post2", "torchvision==0.16.0.post2", "torchaudio==2.1.0.post2",
#                 "intel-extension-for-pytorch==2.1.30+xpu", "oneccl_bind_pt==2.1.300+xpu",
#                 "--extra-index-url", "https://pytorch-extension.intel.com/release-whl/stable/xpu/us/"])

# TTS dependency. Do it on TERMINAL:
# sudo apt install espeak-ng

# Let's also check that Python can see the device
import torch
import intel_extension_for_pytorch as ipex
print(torch.__version__)
print(ipex.__version__)
[print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())]

from TTS.utils.manage import ModelManager
from TTS.utils.synthesizer import Synthesizer

from IPython.display import Audio

import numpy as np
import soundfile as sf

model_manager = ModelManager()
model_path, config_path, model_item = model_manager.download_model("tts_models/en/vctk/vits")
synthesizer = Synthesizer(model_path, config_path, use_cuda=False)

# Move the model to GPU and optimize it using IPEX
synthesizer.tts_model.to('xpu')
synthesizer.tts_model.eval()  # Set the model to evaluation mode for inference
synthesizer.tts_model = ipex.optimize(synthesizer.tts_model, dtype=torch.float32)

speaker_manager = synthesizer.tts_model.speaker_manager
speaker_names = list(speaker_manager.name_to_id.keys())
print("Available speaker names:", speaker_names)

speaker_name = "p229"  # Replace with the actual speaker name you want to use

text = "Your last lap time was 117.547 seconds. That's a bit slower than your best, but you're still doing well. Keep pushing, a really good lap is around 100 seconds. You've got this, let's keep improving."

# Move input data to GPU and run inference with autocast for potential mixed precision
with torch.no_grad(), torch.xpu.amp.autocast(enabled=False):
    wavs = synthesizer.tts(text, speaker_name=speaker_name)

if isinstance(wavs, list):
    # Convert each NumPy array or scalar in the list to a PyTorch tensor
    tensor_list = [torch.tensor(wav, dtype=torch.float32).unsqueeze(0) if np.isscalar(wav) else torch.tensor(wav, dtype=torch.float32) for wav in wavs]
    # Concatenate the tensor list into a single tensor
    wav_concatenated = torch.cat(tensor_list, dim=0)
else:
    # If 'wavs' is already a tensor, use it directly
    wav_concatenated = wavs

# Move the tensor to CPU and convert to NumPy array
wav_concatenated = wav_concatenated.cpu().numpy()

# Save the output to a WAV file
output_path = "output_vctk_vits.wav"
sf.write(output_path, wav_concatenated, synthesizer.tts_config.audio['sample_rate'])
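When comparing the 23 second and 3 second figures, it may help to separate the first call from later calls, since a first run on a GPU backend often includes one-time setup cost. That this explains the gap here is an assumption, not something confirmed in this thread. The sketch below is a generic timing helper; `time_call` and its parameters are hypothetical names.

```python
import time
from statistics import mean


def time_call(fn, warmup: int = 1, repeats: int = 3, sync=None) -> dict:
    """Time fn() separately for warm-up runs and measured runs.

    `sync` is an optional callable invoked after each run so that any
    asynchronous device work has finished before the clock is stopped.
    `repeats` must be at least 1.
    """
    def run_once() -> float:
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        return time.perf_counter() - start

    warmup_times = [run_once() for _ in range(warmup)]
    timed = [run_once() for _ in range(repeats)]
    return {"warmup": warmup_times, "timed": timed, "mean": mean(timed)}


if __name__ == "__main__":
    # Stand-in workload; replace with the real inference call.
    stats = time_call(lambda: sum(range(100_000)), warmup=1, repeats=3)
    print(f"first call: {stats['warmup'][0]:.4f}s, steady mean: {stats['mean']:.4f}s")
```

For the script above, the call could look like `time_call(lambda: synthesizer.tts(text, speaker_name=speaker_name), sync=torch.xpu.synchronize)`, assuming `torch.xpu.synchronize` is available in the installed intel-extension-for-pytorch build.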

gc-fu commented 6 months ago

The code you provided is using ipex, not ipex-llm. Please do check again.

The intel-extension-for-pytorch repo address is here: https://github.com/intel/intel-extension-for-pytorch

Your related code snippet:

import intel_extension_for_pytorch as ipex
synthesizer.tts_model = ipex.optimize(synthesizer.tts_model, dtype=torch.float32)