intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

core dump in ARC gpu while running phi2 and mistral #10553

Open tsantra opened 3 months ago

tsantra commented 3 months ago

I am running Mistral 7B and Phi-2 on an Arc GPU and getting a core dump. I converted each model to lower precision (int4), saved it, and then loaded the int4 model on the GPU.

I can run the same converted models successfully on the CPU. Screenshots are attached.

[Screenshots: lower_precision_cpu_success, lower_precision_gpu_failure, lower_precision_gpu_phi2_failure, lower_precision_phi2_cpu_succesful]

I used the code at the following path to convert the models and save them in int4: ipex-llm/python/llm/example/CPU/HF-Transformers-AutoModels/Save-Load/
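For readers following along, the convert, save, and reload flow described here can be sketched roughly as below. This is a minimal sketch and not the exact script from the linked example: the function names `convert_and_save_int4` and `load_int4`, the directory arguments, and the choice of `sym_int4` (taken from the log lines later in this thread) are assumptions for illustration. The imports are placed inside the functions so the file can be read without ipex-llm installed.

```python
def convert_and_save_int4(model_path: str, save_dir: str) -> None:
    """Quantize a Hugging Face checkpoint to sym_int4 and save the result."""
    # Lazy imports: ipex-llm and transformers are only needed when this runs.
    from ipex_llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer

    # load_in_low_bit="sym_int4" quantizes the weights while loading.
    model = AutoModelForCausalLM.from_pretrained(
        model_path, load_in_low_bit="sym_int4", trust_remote_code=True
    )
    model.save_low_bit(save_dir)

    # Keep the tokenizer next to the weights so save_dir is self-contained.
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    tokenizer.save_pretrained(save_dir)


def load_int4(save_dir: str, device: str = "cpu"):
    """Reload a saved low-bit model; pass device="xpu" for an Intel GPU."""
    from ipex_llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer

    model = AutoModelForCausalLM.load_low_bit(save_dir, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(save_dir, trust_remote_code=True)

    # The reported failure happens on the GPU path, i.e. after this move.
    if device != "cpu":
        model = model.to(device)
    return model, tokenizer
```

Calling `load_int4(save_dir)` and then `load_int4(save_dir, device="xpu")` on the same directory would mirror the CPU-success and GPU-failure cases in the screenshots.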

I have also attached the log produced by running the script ipex-llm/python/llm/scripts/env-check.sh: log_arc_issue.txt

Please help.

Thanks, Titir

leonardozcm commented 3 months ago

Hi @tsantra, I'm sorry, but I cannot reproduce this issue on an MTL Linux iGPU. By the way, when you say you are using an ARC GPU, do you mean an MTL iGPU (which the system also labels as Arc) or a discrete Arc GPU such as the A770?

Mistral

(changmin-llm) arda@xiaoxin04-ubuntu:~/changmin$ python mistral.py 
/home/arda/miniconda3/envs/changmin-llm/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-03-29 08:44:13,907 - INFO - intel_extension_for_pytorch auto imported
2024-03-29 08:44:14,206 - INFO - Converting the current model to sym_int4 format......
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.

-------------------- Output --------------------
[INST] What is AI? [/INST] AI stands for Artificial Intelligence. It is a branch of computer science that focuses on the development of intelligent machines that work, react, and even think like humans

Phi-2

(changmin-llm) arda@xiaoxin04-ubuntu:~/changmin$ python phi.py 
/home/arda/miniconda3/envs/changmin-llm/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-03-29 09:04:56,248 - INFO - intel_extension_for_pytorch auto imported
2024-03-29 09:04:56,320 - INFO - Converting the current model to sym_int4 format......
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Inference time: 2.047560930252075 s
-------------------- Prompt --------------------
 Question:What is AI?

 Answer:
-------------------- Output --------------------
 Question:What is AI?

 Answer: AI stands for Artificial Intelligence. It is a field of computer science that focuses on creating intelligent machines that can perform tasks that would typically require human intelligence.

Both models were loaded with load_low_bit, with IPEX-LLM==0327.

I didn't see any driver information in your log, so you may want to update your system and Python environment and check whether the failure persists.
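As a quick complement to env-check.sh, a small script along these lines can show whether PyTorch sees the GPU at all and what device name it reports, which also helps tell an iGPU from a discrete card. This is only a sketch: `xpu_report` is a hypothetical helper name, and it assumes the `torch.xpu` interface that becomes available once intel_extension_for_pytorch is imported. It does not replace checking the driver installation itself.

```python
def xpu_report() -> dict:
    """Collect basic facts about the PyTorch XPU setup without raising."""
    report = {"torch": None, "ipex": None, "xpu_available": False, "devices": []}

    try:
        import torch
    except ImportError:
        return report  # PyTorch is missing, so nothing else can be checked.
    report["torch"] = torch.__version__

    try:
        # Importing IPEX registers the XPU backend on supported installs.
        import intel_extension_for_pytorch as ipex
        report["ipex"] = ipex.__version__
    except Exception as exc:  # A broken driver or runtime may fail here.
        report["ipex"] = f"import failed: {exc}"

    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        report["xpu_available"] = True
        report["devices"] = [
            xpu.get_device_name(i) for i in range(xpu.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in xpu_report().items():
        print(f"{key}: {value}")
```

If `xpu_available` comes back False, or the IPEX import fails, the problem is more likely in the driver or runtime setup than in the saved int4 model.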