intel-analytics / ipex-llm


[Arc 770][Qwen INT4] The Qwen example fails #9602

Closed xiguiw closed 8 months ago

xiguiw commented 8 months ago

Hardware: Arc 770.

I followed the README to set up the environment: https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen

Running python generate.py fails with the following error:

2023-12-05 16:59:05,982 - bigdl.llm.transformers.utils - INFO - Converting the current model to sym_int4 format.
Traceback (most recent call last):
  File "/home/mengniwa/xwang/bigdl-gpu-int4-inf.py", line 84, in <module>
    output = model.generate(input_ids,
  File "/home/mengniwa/.cache/huggingface/modules/transformers_modules/Qwen/Qwen-7B-Chat/9cae423eb1641ff2c2f1515aac08bcfcc8428b01/modeling_qwen.py", line 1261, in generate
    return super().generate(
  File "/home/mengniwa/miniconda3/envs/bigdl-llm/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/mengniwa/miniconda3/envs/bigdl-llm/lib/python3.9/site-packages/transformers/generation/utils.py", line 1588, in generate
    return self.sample(
  File "/home/mengniwa/miniconda3/envs/bigdl-llm/lib/python3.9/site-packages/transformers/generation/utils.py", line 2642, in sample
    outputs = self(
  File "/home/mengniwa/miniconda3/envs/bigdl-llm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/mengniwa/.cache/huggingface/modules/transformers_modules/Qwen/Qwen-7B-Chat/9cae423eb1641ff2c2f1515aac08bcfcc8428b01/modeling_qwen.py", line 1045, in forward
    transformer_outputs = self.transformer(
  File "/home/mengniwa/miniconda3/envs/bigdl-llm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/mengniwa/.cache/huggingface/modules/transformers_modules/Qwen/Qwen-7B-Chat/9cae423eb1641ff2c2f1515aac08bcfcc8428b01/modeling_qwen.py", line 893, in forward
    outputs = block(
  File "/home/mengniwa/miniconda3/envs/bigdl-llm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/mengniwa/.cache/huggingface/modules/transformers_modules/Qwen/Qwen-7B-Chat/9cae423eb1641ff2c2f1515aac08bcfcc8428b01/modeling_qwen.py", line 612, in forward
    attn_outputs = self.attn(
  File "/home/mengniwa/miniconda3/envs/bigdl-llm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/mengniwa/miniconda3/envs/bigdl-llm/lib/python3.9/site-packages/bigdl/llm/transformers/models/qwen.py", line 192, in qwen_attention_forward
    attn_output, attn_weight = self._attn(
  File "/home/mengniwa/.cache/huggingface/modules/transformers_modules/Qwen/Qwen-7B-Chat/9cae423eb1641ff2c2f1515aac08bcfcc8428b01/modeling_qwen.py", line 352, in _attn
    attn_weights = torch.where(
RuntimeError: The size of tensor a (19) must match the size of tensor b (10) at non-singleton dimension 2
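
For readers unfamiliar with this class of error: the RuntimeError above is a broadcasting failure, where two tensors passed to an elementwise operation (here torch.where) disagree in a dimension and neither side has size 1 there. The following is a minimal pure-Python sketch of that rule, not PyTorch's actual implementation. The function name broadcast_shape and the example shapes are hypothetical; the traceback only tells us that dimension 2 was 19 on one side and 10 on the other.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two tensor shapes, or raise ValueError.

    Illustrative sketch of the usual broadcasting rule: shapes are aligned
    from the trailing dimension, and two sizes are compatible when they are
    equal or when one of them is 1.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = [0] * ndim
    # Walk from the last dimension to the first.
    for dim in range(ndim - 1, -1, -1):
        if a[dim] == b[dim] or b[dim] == 1:
            result[dim] = a[dim]
        elif a[dim] == 1:
            result[dim] = b[dim]
        else:
            raise ValueError(
                f"The size of tensor a ({a[dim]}) must match the size of "
                f"tensor b ({b[dim]}) at non-singleton dimension {dim}"
            )
    return tuple(result)


# Hypothetical shapes: attention weights vs. a mask whose dimension 2 differs.
attn_weights_shape = (1, 32, 19, 29)
good_mask_shape = (1, 1, 19, 29)
bad_mask_shape = (1, 1, 10, 29)

print(broadcast_shape(attn_weights_shape, good_mask_shape))
try:
    broadcast_shape(attn_weights_shape, bad_mask_shape)
except ValueError as exc:
    print(exc)
```

In other words, a mask built for one sequence length was applied to attention weights computed for another, which is consistent with the failure only appearing on the optimized attention path.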

Changing optimize_model=True to optimize_model=False makes it run successfully:

model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True, optimize_model=False, trust_remote_code=True, use_cache=True)
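
To make it easy to switch between the failing and working configurations while debugging, the call above could be wrapped as in the sketch below. The helper names build_load_kwargs and load_qwen are hypothetical; the keyword arguments are the ones from the line above, and the import path bigdl.llm.transformers is the module named in the log output. The import is done lazily, so the file can be imported even where bigdl-llm is not installed.

```python
def build_load_kwargs(optimize_model: bool = True) -> dict:
    """Keyword arguments mirroring the from_pretrained call in this report.

    Hypothetical helper: only optimize_model is varied, everything else is
    kept as in the original call.
    """
    return {
        "load_in_4bit": True,
        "optimize_model": optimize_model,
        "trust_remote_code": True,
        "use_cache": True,
    }


def load_qwen(model_path: str, optimize_model: bool = True):
    """Load the model with or without the BigDL-LLM model optimizations."""
    # Lazy import: requires bigdl-llm to be installed when actually called.
    from bigdl.llm.transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_path, **build_load_kwargs(optimize_model)
    )


# Workaround configuration from this report: optimizations disabled.
print(build_load_kwargs(optimize_model=False))
```

Calling load_qwen(model_path, optimize_model=False) then corresponds to the workaround, and optimize_model=True to the configuration that raised the error.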

Installed versions:

bigdl-core-xe                 2.5.0b20231202   pypi_0   pypi
bigdl-core-xe-esimd           2.5.0b20231202   pypi_0   pypi
bigdl-llm                     2.5.0b20231202   pypi_0   pypi
intel-extension-for-pytorch   2.0.110+xpu
transformers                  4.31.0           pypi_0   pypi

qiyuangong commented 8 months ago

Related to https://github.com/intel-analytics/BigDL/issues/9582#issuecomment-1840145208

Please try bigdl-llm[xpu] 2.5.0b20231205

pip install bigdl-llm[xpu] --pre --upgrade -f https://pypi.org/simple
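
After upgrading, it can help to confirm that the environment really picked up the suggested nightly. The sketch below uses only the standard library. It assumes that nightly versions end in bYYYYMMDD (as in 2.5.0b20231202) and that builds dated 20231205 or later contain the fix; the second point is an assumption based on this thread, not something verified. The function names are hypothetical.

```python
import re
from importlib import metadata

# Nightly build suggested in this thread.
SUGGESTED_NIGHTLY = 20231205


def nightly_date(version):
    """Extract the YYYYMMDD suffix from a version like '2.5.0b20231202'.

    Returns None when the version does not end in 'b' plus eight digits.
    """
    match = re.search(r"b(\d{8})$", version)
    return int(match.group(1)) if match else None


def has_suggested_build(version):
    """True if the version is a nightly dated on or after SUGGESTED_NIGHTLY."""
    date = nightly_date(version)
    return date is not None and date >= SUGGESTED_NIGHTLY


def installed_version(package="bigdl-llm"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The version from the original report predates the suggested build.
print(has_suggested_build("2.5.0b20231202"))
print(has_suggested_build("2.5.0b20231205"))
```

In a real environment, has_suggested_build(installed_version()) can be used once installed_version() returns a string rather than None.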

xiguiw commented 8 months ago

Related to #9582 (comment)

Please try bigdl-llm 2.5.0b20231205

pip install bigdl-llm --pre --upgrade -f https://pypi.org/simple

Great, it works! Thanks a lot!