intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

about the bug #10429

Open K-Alex13 opened 6 months ago

K-Alex13 commented 6 months ago

```python
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer   # missing from the original snippet
import torch.nn.utils.prune as prune    # imported in the original but unused here

model_path = r'D:\test_bigdl\model\Baichuan2-7B-Chat'

# The original snippet ended with `return model, tokenizer`,
# so it presumably lived inside a loader function like this one.
def load_model():
    model = AutoModelForCausalLM.from_pretrained(
        model_path, trust_remote_code=True, optimize_model=True,
        use_cache=True, cpu_embedding=True, load_in_4bit=True).bfloat16().eval()
    model = model.to('xpu')
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    return model, tokenizer
```

I try to initialize the model this way. The model works, but after maybe 20 or 30 dialogue turns it stops without any error. What is wrong here? And what can I do to use it without this problem?
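For reference, a minimal single-turn generation with the loader above would look something like this sketch; the prompt and generation settings are illustrative, not taken from the report:

```python
import torch

model, tokenizer = load_model()  # the reconstructed loader above

prompt = "What is BigDL-LLM?"  # illustrative prompt
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to('xpu')
with torch.inference_mode():
    output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```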

liu-shaojun commented 6 months ago

Hi, which platform do you use? Windows iGPU, Arc, or PVC?

K-Alex13 commented 6 months ago

I use Windows with an Arc A770.

liu-shaojun commented 6 months ago

Can you share your complete code, so that we can try to reproduce this issue on our side?

Also, please provide your torch version, oneAPI version, driver version, and bigdl-llm version.

K-Alex13 commented 6 months ago

oneAPI version is 2024. I downloaded bigdl-llm around 2023.1.15, so I am not sure of the exact version. torch version is 2.1.0a0+cxx11.abi; GPU/driver: Arc A770.

I use the code below to chat, and the prompt is around 800 words; the model initialization code is shown above.

```python
response = model.chat(tokenizer, message, stream=True)
```
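For a repeated dialogue like the one described, that call is presumably wrapped in a loop roughly like the sketch below. The `messages` format follows Baichuan2's published chat usage; the loop structure itself is an assumption, not code from the report:

```python
messages = []
while True:
    user_input = input("User: ")
    messages.append({"role": "user", "content": user_input})
    response = ""
    # With stream=True, Baichuan2's chat() yields progressively longer
    # partial responses; the last yielded value is the full reply.
    for response in model.chat(tokenizer, messages, stream=True):
        pass
    print("Assistant:", response)
    messages.append({"role": "assistant", "content": response})
```

Note that if the history list grows without bound, each turn's prompt gets longer, which is one common reason long dialogues eventually stall.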

liu-shaojun commented 6 months ago

You can acquire the bigdl-llm version and driver version through the following commands:

```bash
pip list | grep bigdl-llm   # bigdl-llm version
sycl-ls                     # driver / device information
```
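Since this is a Windows shell, where `grep` is often unavailable, the same version information can also be read with:

```bash
pip show bigdl-llm
```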

Also, could you please try the latest bigdl-llm? 2023.01.15 is a very old version. You can install the latest bigdl-llm with:

```bash
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
```
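After upgrading, a quick sanity check (a sketch, assuming the matching IPEX XPU build installed cleanly) is to confirm PyTorch can see the Arc GPU:

```python
import torch
import intel_extension_for_pytorch as ipex  # registers the 'xpu' device with PyTorch

print(torch.xpu.is_available())       # expect True on a working Arc setup
print(torch.xpu.get_device_name(0))   # expect the Arc A770 to be listed
```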

K-Alex13 commented 6 months ago

I can try a new version this weekend, thank you.