intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

Unable to run on dGPU #10515

Closed: dyedd closed this issue 5 months ago

dyedd commented 6 months ago

Hello, I tried to run the code from https://gitee.com/Pauntech/chat-glm3/blob/master/chatglm3_web_demo.py, but I ran into a problem:

LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [13th Gen Intel(R) Core(TM) i5-13600KF]
Registry and code: 13 MB
Uptime: 6.540896 s
Segmentation fault (core dumped)

You can see that it appears to run on the CPU, even though the code clearly offloads to the XPU. Here is the output of sycl-ls:

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2023.16.12.0.12_195853.xmain-hotfix]
[opencl:cpu:1] Intel(R) OpenCL, 13th Gen Intel(R) Core(TM) i5-13600KF OpenCL 3.0 (Build 0) [2023.16.12.0.12_195853.xmain-hotfix]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [23.35.27191.42]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.27191]

I used the method from https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/KeyFeatures/multi_gpus_selection.html, but it still fails.
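
For reference, the demo's loading code follows the usual BigDL-LLM GPU pattern, roughly like the sketch below (not the demo's exact code; model_path is a placeholder, and the package has since been renamed to ipex_llm):

```python
# Sketch of the standard BigDL-LLM GPU loading pattern (assumed):
# quantize the model to 4-bit on load, then move it to the Intel GPU.
import intel_extension_for_pytorch as ipex  # the import itself registers the 'xpu' device
from bigdl.llm.transformers import AutoModel
from transformers import AutoTokenizer

model_path = "THUDM/chatglm3-6b"  # placeholder
model = AutoModel.from_pretrained(model_path,
                                  load_in_4bit=True,
                                  trust_remote_code=True)
model = model.to("xpu")  # offload to the dGPU; "xpu:0" pins a specific device
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```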


I also ran the code from https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm3/streamchat.py, but I hit this problem:

Traceback (most recent call last):
  File "/home/dyedd/projects/agent/test/streamchat.py", line 59, in <module>
    output = model.generate(input_ids,
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/generation/utils.py", line 1538, in generate
    return self.greedy_search(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/generation/utils.py", line 2362, in greedy_search
    outputs = self(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 938, in forward
    transformer_outputs = self.transformer(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/bigdl/llm/transformers/models/chatglm2.py", line 167, in chatglm2_model_forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 641, in forward
    layer_ret = layer(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 545, in forward
    attention_output, kv_cache = self.self_attention(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/bigdl/llm/transformers/models/chatglm2.py", line 191, in chatglm2_attention_forward
    return forward_function(
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/bigdl/llm/transformers/models/chatglm2.py", line 437, in chatglm2_attention_forward_8eb45c
    key_layer, value_layer = append_kv_cache(cache_k, cache_v, key_layer, value_layer)
  File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/bigdl/llm/transformers/models/utils.py", line 66, in append_kv_cache
    new_cache_k[:, :, cache_k.size(2):cache_k.size(2) + key_states.size(2), :] = key_states
RuntimeError: The expanded size of the tensor (2) must match the existing size (32) at non-singleton dimension 1.  Target sizes: [1, 2, 1, 128].  Tensor sizes: [32, 1, 128]
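
For what it's worth, the shape mismatch can be reproduced in isolation. A minimal sketch, assuming the [1, 2, 1, 128] target slice is a KV cache allocated for ChatGLM's 2 multi-query key/value heads while the incoming key states are still shaped for all 32 attention heads:

```python
import torch

# Hypothetical shapes taken from the error message above:
new_cache_k = torch.zeros(1, 2, 4, 128)  # [batch, kv_heads, seq, head_dim]
key_states = torch.zeros(32, 1, 128)     # [heads, seq, head_dim], 32 heads

# Broadcasting aligns trailing dims (128 vs 128 ok, 1 vs 1 ok), then 2 vs 32 fails:
new_cache_k[:, :, 3:4, :] = key_states
# RuntimeError: The expanded size of the tensor (2) must match the existing
# size (32) at non-singleton dimension 1.  Target sizes: [1, 2, 1, 128].
# Tensor sizes: [32, 1, 128]
```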

So, how can I run this code on the dGPU?

jason-dai commented 6 months ago

Can you try the Linux or Windows quickstart to verify the installation?

dyedd commented 6 months ago

> Can you try the Linux or Windows quickstart to verify the installation?

The installation is fine; the code runs well.

Oscilloscope98 commented 5 months ago

This bug will be fixed in https://github.com/intel-analytics/ipex-llm/pull/10540 :)

You could try again tomorrow with ipex-llm>=2.1.0b20240326. Refer to here for more installation details regarding ipex-llm. And please let us know if you have any further questions :)
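
(A typical upgrade command, assuming the Linux XPU wheel index from the installation guide, is `pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/`; check the guide for the index URL matching your platform.)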

dyedd commented 5 months ago

> This bug will be fixed in #10540 :)
>
> You could try again tomorrow with ipex-llm>=2.1.0b20240326. Refer to here for more installation details regarding ipex-llm. And please let us know if you have any further questions :)

No, there is still a new problem.

dyedd commented 5 months ago

@Oscilloscope98

I also found that ipex-llm can't work with Streamlit at the moment. Note that this code still needs to be updated to the new API.

Error:

```
PackageNotFoundError: No package metadata was found for bitsandbytes

Traceback:
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 542, in _run_script
    exec(code, module.__dict__)
File "/home/dyedd/projects/agent/test/web.py", line 31, in <module>
    tokenizer, model = get_model()
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 210, in wrapper
    return cached_func(*args, **kwargs)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 239, in __call__
    return self._get_or_create_cached_value(args, kwargs)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 266, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 322, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
File "/home/dyedd/projects/agent/test/web.py", line 18, in get_model
    model = AutoModel.from_pretrained(model_path,
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 488, in from_pretrained
    return model_class.from_pretrained(
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2256, in from_pretrained
    quantization_config, kwargs = BitsAndBytesConfig.from_dict(
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 189, in from_dict
    config = cls(**config_dict)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 118, in __init__
    self.post_init()
File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 144, in post_init
    if self.load_in_4bit and not version.parse(importlib.metadata.version("bitsandbytes")) >= version.parse(
File "/home/dyedd/.conda/envs/pt/lib/python3.10/importlib/metadata/__init__.py", line 996, in version
    return distribution(distribution_name).version
File "/home/dyedd/.conda/envs/pt/lib/python3.10/importlib/metadata/__init__.py", line 969, in distribution
    return Distribution.from_name(distribution_name)
File "/home/dyedd/.conda/envs/pt/lib/python3.10/importlib/metadata/__init__.py", line 548, in from_name
    raise PackageNotFoundError(name)
```
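
The traceback goes through transformers' own from_pretrained into BitsAndBytesConfig, which suggests web.py imports AutoModel from transformers; that code path implements load_in_4bit via the bitsandbytes package. A minimal sketch of the ipex-llm loading path instead (model_path is a placeholder):

```python
# Sketch: ipex-llm's AutoModel implements load_in_4bit itself and does not
# depend on bitsandbytes (assumes ipex-llm>=2.1.0b20240326 is installed).
from ipex_llm.transformers import AutoModel  # not transformers.AutoModel
from transformers import AutoTokenizer

model_path = "THUDM/chatglm3-6b"  # placeholder
model = AutoModel.from_pretrained(model_path,
                                  load_in_4bit=True,
                                  trust_remote_code=True).to("xpu")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```
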
Zhangky11 commented 5 months ago

> > This bug will be fixed in #10540 :) You could try again tomorrow with ipex-llm>=2.1.0b20240326. Refer to here for more installation details regarding ipex-llm. And please let us know if you have any further questions :)
>
> No, there are still new problems:
>
> - chatglm3/streamchat.py with --disable-stream:
>
>   ipex_llm/transformers/models/chatglm2.py", line 432, in chatglm2_attention_forward_8eb45c
>       new_cache_k[:] = cache_k
>   RuntimeError: Native API failed. Native API returns: -6 (PI_ERROR_OUT_OF_HOST_MEMORY) -6 (PI_ERROR_OUT_OF_HOST_MEMORY)
>
> - chatglm3/streamchat.py without --disable-stream:
>
>   -------------------- Stream Chat Output --------------------
>   Traceback (most recent call last):
>     File "/home/dyedd/projects/agent/./test/streamchat.py", line 57, in <module>
>       for response, history in model.stream_chat(tokenizer, args.question, history=[]):
>     File "/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
>       response = gen.send(None)
>     File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 1078, in stream_chat
>       response, new_history = self.process_response(response, history)
>     File "/home/dyedd/.cache/huggingface/modules/transformers_modules/chatglm3-6b-base/modeling_chatglm.py", line 1004, in process_response
>       metadata, content = response.split("\n", maxsplit=1)
>   ValueError: not enough values to unpack (expected 2, got 1)

We haven't been able to reproduce this issue yet on our Arc A770. Would you mind running the python/llm/scripts/env-check.sh script and pasting the output here, so that we can have more information about your environment?
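
(The script lives in the ipex-llm source tree; assuming a checkout, running something like `bash python/llm/scripts/env-check.sh` with the conda environment active should print the report.)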

dyedd commented 5 months ago

> We haven't been able to reproduce this issue yet on our Arc A770. Would you mind running the python/llm/scripts/env-check.sh script and pasting the output here, so that we can have more information about your environment?

No problem.

env-check.sh output:

```
-----------------------------------------------------------------
PYTHON_VERSION=3.10.13
-----------------------------------------------------------------
transformers=4.31.0
-----------------------------------------------------------------
PyTorch is not installed.
-----------------------------------------------------------------
ipex-llm Version: 2.1.0b20240326
-----------------------------------------------------------------
IPEX is not installed.
-----------------------------------------------------------------
CPU Information:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 20
On-line CPU(s) list: 0-19
Vendor ID: GenuineIntel
Model name: 13th Gen Intel(R) Core(TM) i5-13600KF
CPU family: 6
Model: 183
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 1
Stepping: 1
CPU max MHz: 5100.0000
CPU min MHz: 800.0000
BogoMIPS: 6988.80
-----------------------------------------------------------------
MemTotal: 65679356 kB
-----------------------------------------------------------------
ulimit:
real-time non-blocking time (microseconds, -R) unlimited
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256260
max locked memory (kbytes, -l) 8209916
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 256260
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
-----------------------------------------------------------------
Operating System:
Ubuntu 22.04.4 LTS \n \l
-----------------------------------------------------------------
Environment Variable:
SHELL=/bin/bash
CONDA_EXE=/usr/local/miniconda3/bin/conda
_CE_M=
LC_ADDRESS=zh_CN.UTF-8
LC_NAME=zh_CN.UTF-8
LC_MONETARY=zh_CN.UTF-8
PWD=/home/dyedd/projects/agent
LOGNAME=dyedd
XDG_SESSION_TYPE=tty
CONDA_PREFIX=/home/dyedd/.conda/envs/pt
MOTD_SHOWN=pam
HOME=/home/dyedd
LANG=en_US.UTF-8
LC_PAPER=zh_CN.UTF-8
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
CONDA_PROMPT_MODIFIER=(pt)
SSH_CONNECTION=127.0.0.1 57220 127.0.0.1 22
LESSCLOSE=/usr/bin/lesspipe %s %s
XDG_SESSION_CLASS=user
TERM=xterm-256color
LC_IDENTIFICATION=zh_CN.UTF-8
_CE_CONDA=
LESSOPEN=| /usr/bin/lesspipe %s
USER=dyedd
CONDA_SHLVL=2
SHLVL=2
LC_TELEPHONE=zh_CN.UTF-8
LC_MEASUREMENT=zh_CN.UTF-8
XDG_SESSION_ID=702
CONDA_PYTHON_EXE=/usr/local/miniconda3/bin/python
XDG_RUNTIME_DIR=/run/user/1000
SSH_CLIENT=127.0.0.1 57220 22
CONDA_DEFAULT_ENV=pt
LC_TIME=zh_CN.UTF-8
XDG_DATA_DIRS=/usr/share/gnome:/usr/local/share:/usr/share:/var/lib/snapd/desktop
PATH=/home/dyedd/.conda/envs/pt/bin:/usr/local/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
SSH_TTY=/dev/pts/4
CONDA_PREFIX_1=/usr/local/miniconda3
LC_NUMERIC=zh_CN.UTF-8
OLDPWD=/home/dyedd/projects
_=/usr/bin/printenv
-----------------------------------------------------------------
xpu-smi is properly installed.
-----------------------------------------------------------------
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Name: Intel(R) Arc(TM) A770 Graphics                                          |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0003-0000-000856a08086                                       |
|           | PCI BDF Address: 0000:03:00.0                                                        |
|           | DRM Device: /dev/dri/card0                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
-----------------------------------------------------------------
```

Zhangky11 commented 5 months ago

Based on the provided environment information, it seems that PyTorch and IPEX are not installed. Could you please set up the correct environment and then run the shell script?

Zhangky11 commented 5 months ago

You can follow the guide below to set up the environment or check if the environment is correct: https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux

dyedd commented 5 months ago

> You can follow the guide below to set up the environment or check if the environment is correct: https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux

I forgot to source the oneAPI environment, so the script didn't check things properly.
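
(Concretely, the missing step was sourcing oneAPI before running the script, e.g. `source /opt/intel/oneapi/setvars.sh`, assuming the default install location.)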

env-check.sh output:

```
-----------------------------------------------------------------
PYTHON_VERSION=3.10.13
-----------------------------------------------------------------
transformers=4.31.0
-----------------------------------------------------------------
torch=2.1.0a0+cxx11.abi
-----------------------------------------------------------------
ipex-llm Version: 2.1.0b20240326
-----------------------------------------------------------------
/home/dyedd/.conda/envs/pt/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
ipex=2.1.10+xpu
-----------------------------------------------------------------
CPU Information:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 20
On-line CPU(s) list: 0-19
Vendor ID: GenuineIntel
Model name: 13th Gen Intel(R) Core(TM) i5-13600KF
CPU family: 6
Model: 183
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 1
Stepping: 1
CPU max MHz: 5100.0000
CPU min MHz: 800.0000
BogoMIPS: 6988.80
-----------------------------------------------------------------
MemTotal: 65679356 kB
-----------------------------------------------------------------
ulimit:
real-time non-blocking time (microseconds, -R) unlimited
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256260
max locked memory (kbytes, -l) 8209916
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 256260
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
-----------------------------------------------------------------
Operating System:
Ubuntu 22.04.4 LTS \n \l
-----------------------------------------------------------------
Environment Variable:
SHELL=/bin/bash
TBBROOT=/opt/intel/oneapi/tbb/2021.11/env/..
ONEAPI_ROOT=/opt/intel/oneapi
CONDA_EXE=/usr/local/miniconda3/bin/conda
_CE_M=
PKG_CONFIG_PATH=/opt/intel/oneapi/vtune/2024.0/include/pkgconfig/lib64:/opt/intel/oneapi/tbb/2021.11/env/../lib/pkgconfig:/opt/intel/oneapi/mpi/2021.11/lib/pkgconfig:/opt/intel/oneapi/mkl/2024.0/lib/pkgconfig:/opt/intel/oneapi/ippcp/2021.9/lib/pkgconfig:/opt/intel/oneapi/dpl/2022.3/lib/pkgconfig:/opt/intel/oneapi/dnnl/2024.0/lib/pkgconfig:/opt/intel/oneapi/dal/2024.0/lib/pkgconfig:/opt/intel/oneapi/compiler/2024.0/lib/pkgconfig:/opt/intel/oneapi/ccl/2021.11/lib/pkgconfig/:/opt/intel/oneapi/advisor/2024.0/include/pkgconfig/lib64:
USE_XETLA=OFF
LC_ADDRESS=zh_CN.UTF-8
ACL_BOARD_VENDOR_PATH=/opt/Intel/OpenCLFPGA/oneAPI/Boards
LC_NAME=zh_CN.UTF-8
FPGA_VARS_DIR=/opt/intel/oneapi/compiler/2024.0/opt/oclfpga
CCL_ROOT=/opt/intel/oneapi/ccl/2021.11
I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.11
LC_MONETARY=zh_CN.UTF-8
FI_PROVIDER_PATH=/opt/intel/oneapi/mpi/2021.11/opt/mpi/libfabric/lib/prov:/usr/lib/x86_64-linux-gnu/libfabric
DNNLROOT=/opt/intel/oneapi/dnnl/2024.0
DIAGUTIL_PATH=/opt/intel/oneapi/dpcpp-ct/2024.0/etc/dpct/sys_check/sys_check.sh:/opt/intel/oneapi/debugger/2024.0/etc/debugger/sys_check/sys_check.py:/opt/intel/oneapi/compiler/2024.0/etc/compiler/sys_check/sys_check.sh
ADVISOR_2024_DIR=/opt/intel/oneapi/advisor/2024.0
PWD=/home/dyedd/projects/agent
CCL_CONFIGURATION=cpu_gpu_dpcpp
LOGNAME=dyedd
DPL_ROOT=/opt/intel/oneapi/dpl/2022.3
XDG_SESSION_TYPE=tty
CONDA_PREFIX=/home/dyedd/.conda/envs/pt
MANPATH=/opt/intel/oneapi/mpi/2021.11/share/man:/opt/intel/oneapi/debugger/2024.0/share/man:/opt/intel/oneapi/compiler/2024.0/documentation/en/man/common:
MOTD_SHOWN=pam
HOME=/home/dyedd
GDB_INFO=/opt/intel/oneapi/debugger/2024.0/share/info/
CCL_CONFIGURATION_PATH=
LANG=en_US.UTF-8
LC_PAPER=zh_CN.UTF-8
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
SETVARS_COMPLETED=1
CONDA_PROMPT_MODIFIER=(pt)
APM=/opt/intel/oneapi/advisor/2024.0/perfmodels
CMAKE_PREFIX_PATH=/opt/intel/oneapi/tbb/2021.11/env/..:/opt/intel/oneapi/mkl/2024.0/lib/cmake:/opt/intel/oneapi/ipp/2021.10/lib/cmake/ipp:/opt/intel/oneapi/dpl/2022.3/lib/cmake/oneDPL:/opt/intel/oneapi/dnnl/2024.0/lib/cmake:/opt/intel/oneapi/dal/2024.0:/opt/intel/oneapi/compiler/2024.0
SSH_CONNECTION=127.0.0.1 56490 127.0.0.1 22
CMPLR_ROOT=/opt/intel/oneapi/compiler/2024.0
FPGA_VARS_ARGS=
INFOPATH=/opt/intel/oneapi/debugger/2024.0/opt/debugger/lib
IPPROOT=/opt/intel/oneapi/ipp/2021.10
IPP_TARGET_ARCH=intel64
LESSCLOSE=/usr/bin/lesspipe %s %s
XDG_SESSION_CLASS=user
PYTHONPATH=/opt/intel/oneapi/advisor/2024.0/pythonapi
TERM=xterm-256color
LC_IDENTIFICATION=zh_CN.UTF-8
_CE_CONDA=
DALROOT=/opt/intel/oneapi/dal/2024.0
LESSOPEN=| /usr/bin/lesspipe %s
USER=dyedd
LIBRARY_PATH=/opt/intel/oneapi/tbb/2021.11/env/../lib/intel64/gcc4.8:/opt/intel/oneapi/mpi/2021.11/lib:/opt/intel/oneapi/mkl/2024.0/lib/:/opt/intel/oneapi/ippcp/2021.9/lib/:/opt/intel/oneapi/ipp/2021.10/lib:/opt/intel/oneapi/dpl/2022.3/lib:/opt/intel/oneapi/dnnl/2024.0/lib:/opt/intel/oneapi/dal/2024.0/lib:/opt/intel/oneapi/compiler/2024.0/lib:/opt/intel/oneapi/ccl/2021.11/lib/
DAL_MAJOR_BINARY=2
SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
CONDA_SHLVL=2
IPPCRYPTOROOT=/opt/intel/oneapi/ippcp/2021.9
IPPCP_TARGET_ARCH=intel64
SHLVL=2
LC_TELEPHONE=zh_CN.UTF-8
VTUNE_PROFILER_2024_DIR=/opt/intel/oneapi/vtune/2024.0
OCL_ICD_FILENAMES=libintelocl_emu.so:libalteracl.so:/opt/intel/oneapi/compiler/2024.0/lib/libintelocl.so
LC_MEASUREMENT=zh_CN.UTF-8
XDG_SESSION_ID=708
CONDA_PYTHON_EXE=/usr/local/miniconda3/bin/python
CLASSPATH=/opt/intel/oneapi/mpi/2021.11/share/java/mpi.jar
INTELFPGAOCLSDKROOT=/opt/intel/oneapi/compiler/2024.0/opt/oclfpga
LD_LIBRARY_PATH=/opt/intel/oneapi/tbb/2021.11/env/../lib/intel64/gcc4.8:/opt/intel/oneapi/mpi/2021.11/opt/mpi/libfabric/lib:/opt/intel/oneapi/mpi/2021.11/lib:/opt/intel/oneapi/mkl/2024.0/lib:/opt/intel/oneapi/ippcp/2021.9/lib/:/opt/intel/oneapi/ipp/2021.10/lib:/opt/intel/oneapi/dpl/2022.3/lib:/opt/intel/oneapi/dnnl/2024.0/lib:/opt/intel/oneapi/debugger/2024.0/opt/debugger/lib:/opt/intel/oneapi/dal/2024.0/lib:/opt/intel/oneapi/compiler/2024.0/opt/oclfpga/host/linux64/lib:/opt/intel/oneapi/compiler/2024.0/opt/compiler/lib:/opt/intel/oneapi/compiler/2024.0/lib:/opt/intel/oneapi/ccl/2021.11/lib/
VTUNE_PROFILER_DIR=/opt/intel/oneapi/vtune/2024.0
XDG_RUNTIME_DIR=/run/user/1000
SSH_CLIENT=127.0.0.1 56490 22
CONDA_DEFAULT_ENV=pt
MKLROOT=/opt/intel/oneapi/mkl/2024.0
LC_TIME=zh_CN.UTF-8
DAL_MINOR_BINARY=0
XDG_DATA_DIRS=/usr/share/gnome:/usr/local/share:/usr/share:/var/lib/snapd/desktop
NLSPATH=/opt/intel/oneapi/mkl/2024.0/share/locale/%l_%t/%N:/opt/intel/oneapi/compiler/2024.0/lib/locale/%l_%t/%N
PATH=/opt/intel/oneapi/vtune/2024.0/bin64:/opt/intel/oneapi/mpi/2021.11/opt/mpi/libfabric/bin:/opt/intel/oneapi/mpi/2021.11/bin:/opt/intel/oneapi/mkl/2024.0/bin/:/opt/intel/oneapi/dpcpp-ct/2024.0/bin:/opt/intel/oneapi/dev-utilities/2024.0/bin:/opt/intel/oneapi/debugger/2024.0/opt/debugger/bin:/opt/intel/oneapi/compiler/2024.0/opt/oclfpga/bin:/opt/intel/oneapi/compiler/2024.0/bin:/opt/intel/oneapi/advisor/2024.0/bin64:/home/dyedd/.conda/envs/pt/bin:/usr/local/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
SYCL_CACHE_PERSISTENT=1
INTEL_PYTHONHOME=/opt/intel/oneapi/debugger/2024.0/opt/debugger
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
SSH_TTY=/dev/pts/4
CONDA_PREFIX_1=/usr/local/miniconda3
CPATH=/opt/intel/oneapi/tbb/2021.11/env/../include:/opt/intel/oneapi/mpi/2021.11/include:/opt/intel/oneapi/mkl/2024.0/include:/opt/intel/oneapi/ippcp/2021.9/include:/opt/intel/oneapi/ipp/2021.10/include:/opt/intel/oneapi/dpl/2022.3/include:/opt/intel/oneapi/dpcpp-ct/2024.0/include:/opt/intel/oneapi/dnnl/2024.0/include:/opt/intel/oneapi/dev-utilities/2024.0/include:/opt/intel/oneapi/dal/2024.0/include/dal:/opt/intel/oneapi/compiler/2024.0/opt/oclfpga/include:/opt/intel/oneapi/ccl/2021.11/include
LC_NUMERIC=zh_CN.UTF-8
OLDPWD=/home/dyedd
_=/usr/bin/printenv
-----------------------------------------------------------------
xpu-smi is properly installed.
-----------------------------------------------------------------
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Name: Intel(R) Arc(TM) A770 Graphics                                          |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0003-0000-000856a08086                                       |
|           | PCI BDF Address: 0000:03:00.0                                                        |
|           | DRM Device: /dev/dri/card0                                                           |
|           | Function Type: physical                                                              |
+-----------+--------------------------------------------------------------------------------------+
-----------------------------------------------------------------
```

Zhangky11 commented 5 months ago

Sorry, we still can't reproduce the error you encountered while running chatglm3/streamchat.py. The error you mentioned is OUT_OF_HOST_MEMORY, indicating a memory overflow on the CPU, but you are actually running model inference on XPU. Therefore, could you please provide further details on the input parameters you used when running chatglm3/streamchat.py, including the question prompt, max_new_token, etc., so that we can further replicate the issue?

dyedd commented 5 months ago

> Sorry, we still can't reproduce the error you encountered while running chatglm3/streamchat.py. The error you mentioned is OUT_OF_HOST_MEMORY, indicating a memory overflow on the CPU, but you are actually running model inference on XPU. Therefore, could you please provide further details on the input parameters you used when running chatglm3/streamchat.py, including the question prompt, max_new_token, etc., so that we can further replicate the issue?

My config is the default. If you can't reproduce the error, I could give you SSH access.

Zhangky11 commented 5 months ago

Sure, you could leave your email address and I'll contact you.

dyedd commented 5 months ago

> Sure, you could leave your email address and I'll contact you.


Zhangky11 commented 5 months ago

Please ensure that the modeling file for chatglm3 is downloaded from the official repository. You could go to ModelScope to download the corresponding file.
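
If it helps, a hypothetical sketch of fetching the official files via the ModelScope SDK (the model id "ZhipuAI/chatglm3-6b" is an assumption; check ModelScope for the exact id):

```python
# Download the official chatglm3-6b chat model from ModelScope (sketch).
from modelscope import snapshot_download

model_path = snapshot_download("ZhipuAI/chatglm3-6b")
print(model_path)  # local directory containing the official modeling_chatglm.py
```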

dyedd commented 5 months ago

> Please ensure that the modeling file for chatglm3 is downloaded from the official repository. You could go to ModelScope to download the corresponding file.

Thanks, maybe I downloaded the base model instead of the chat model.
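
That would also explain the earlier ValueError: stream_chat's process_response splits the reply on a newline because the chat model emits a metadata line followed by the content, while the base model produces plain text with no such newline. A minimal illustration (the response string is made up):

```python
# process_response in modeling_chatglm.py expects "<metadata>\n<content>":
response = "plain continuation from the base model"  # no newline present
metadata, content = response.split("\n", maxsplit=1)
# ValueError: not enough values to unpack (expected 2, got 1)
```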