intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

Only 1 Arc770 worked when running "finetune_llama2_7b_arc_2_card.sh" with 2 Arc770 in workstation. #9677

Open liang1wang opened 6 months ago

liang1wang commented 6 months ago

Sample: https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/QLoRA-FineTuning/alpaca-qlora/finetune_llama2_7b_arc_2_card.sh
Env: Intel(R) Xeon(R) w7-3455, 2 × Arc770, Ubuntu 22.04 (kernel 6.2.0), 125 GB RAM, oneAPI 23.2.0
Model: Llama-2-7b-hf
(screenshots attached)

plusbang commented 6 months ago

Hi @liang1wang, according to your screenshot, the used GPU memory is 11518 MiB on your first card and 11560 MiB on your second card. Both Arc770 cards were working.
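The check above boils down to: if every card reports non-trivial memory use during the run, both GPUs are participating. A minimal sketch of that check, assuming `xpu-smi dump`-style CSV rows of `device_id, used_memory_MiB` (the helper name and the sample values mirroring the screenshot are illustrative, not part of ipex-llm):

```python
# Hypothetical helper: decide whether every GPU in a set of
# "device_id, used_memory_MiB" CSV rows is actually doing work.
def cards_in_use(csv_rows, min_mib=1000):
    """Return {device_id: True/False} — True if the card's used
    memory exceeds min_mib, suggesting it is active in the run."""
    usage = {}
    for row in csv_rows:
        dev, used = row.split(",")
        usage[int(dev)] = float(used)
    return {dev: used >= min_mib for dev, used in usage.items()}

# Sample readings matching the values reported in this thread.
sample = ["0, 11518", "1, 11560"]
print(cards_in_use(sample))  # → {0: True, 1: True}: both Arc770 cards busy
```

If one card idled at near-zero used memory while the other climbed to ~11 GiB, that would indicate the two-card script was only driving a single GPU.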