Open KiwiHana opened 10 months ago
I can't reproduce your error with the latest bigdl-core-xe and bigdl-llm (2.5.0b20231205). Could you share your OS, driver, and oneAPI versions?
My machine runs Ubuntu 22.04.3 with the Linux 5.19.0-41-generic kernel. The driver version is https://dgpu-docs.intel.com/releases/stable_736_25_20231031.html and oneAPI is 2023.2.0. You can see our recommended requirements here: https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#recommended-requirements
Please make sure your Qwen model is updated to the October 12th version.
Qwen-7B-Chat, after upgrading to bigdl-llm 2.5.0b20231205:
// Case 1: use model.half().to('xpu')
model.first_cost, model.rest_cost_mean 0.13400719300011588 0.030304098225813855
input length is: torch.Size([1, 1987])
model generate cost: 1.3971823479998875
actual_out_len 2
GPU memory usage is about 13440.93
// Case 2: use model.to('xpu')
model.first_cost, model.rest_cost_mean 0.1322723490000044 0.03364436958065286
input length is: torch.Size([1, 1987])
model generate cost: 1.1173496979999982
actual_out_len 2
GPU memory usage is about 10567.20
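To put the two cases side by side, here is a minimal sketch that computes the differences from the numbers posted above. The `Run` dataclass and the `relative_change` helper are hypothetical names made up for this comparison and are not part of bigdl-llm. The values are copied verbatim from the logs, and their units are left as reported because the thread does not state them.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One benchmark run; all values copied from the logs in this thread."""
    label: str
    first_cost: float      # model.first_cost as printed (unit not stated in the thread)
    rest_cost_mean: float  # model.rest_cost_mean as printed
    generate_cost: float   # "model generate cost" as printed
    gpu_memory: float      # reported GPU memory figure (unit not stated in the thread)


def relative_change(new: float, base: float) -> float:
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


# Case 1: model.half().to('xpu')
half_run = Run("half().to('xpu')", 0.13400719300011588,
               0.030304098225813855, 1.3971823479998875, 13440.93)
# Case 2: model.to('xpu')
plain_run = Run("to('xpu')", 0.1322723490000044,
                0.03364436958065286, 1.1173496979999982, 10567.20)

# Differences of Case 1 relative to Case 2.
memory_delta = half_run.gpu_memory - plain_run.gpu_memory
memory_pct = relative_change(half_run.gpu_memory, plain_run.gpu_memory)
rest_pct = relative_change(half_run.rest_cost_mean, plain_run.rest_cost_mean)
first_pct = relative_change(half_run.first_cost, plain_run.first_cost)

print(f"GPU memory: {memory_delta:+.2f} ({memory_pct:+.1f}%)")
print(f"rest_cost_mean: {rest_pct:+.1f}%")
print(f"first_cost: {first_pct:+.1f}%")
```

This only restates the posted figures: in these two runs the half() case reports more GPU memory than the plain to('xpu') case, while the per-token costs stay close to each other.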
Test script: bigdl all-in-one/run-arc.sh, using model.half().to("xpu") instead of model.to("xpu"). Input prompt: 2048.txt; output: 1024 tokens.