intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

Need to upgrade transformers to >=v4.35 to fix the issue "No module named 'transformers.modeling_attn_mask_utils'" #10439

Open · oldmikeyang opened this issue 6 months ago

oldmikeyang commented 6 months ago

2024-03-16 11:17:44,339 - INFO - Converting the current model to bf16 format......
2024-03-16 11:17:44,339 - INFO - BIGDL_OPT_IPEX: True
Traceback (most recent call last):
  File "/home/llm/BigDL/python/llm/dev/benchmark/all-in-one/./run.py", line 1365, in <module>
    run_model(model, api, in_out_pairs, conf['local_model_hub'], conf['warm_up'], conf['num_trials'], conf['num_beams'],
  File "/home/llm/BigDL/python/llm/dev/benchmark/all-in-one/./run.py", line 96, in run_model
    result = run_bigdl_ipex_bf16(repo_id, local_model_hub, in_out_pairs, warm_up, num_trials, num_beams, batch_size)
  File "/home/llm/BigDL/python/llm/dev/benchmark/all-in-one/./run.py", line 1108, in run_bigdl_ipex_bf16
    model = AutoModelForCausalLM.from_pretrained(model_path, load_in_low_bit='bf16', trust_remote_code=True, torch_dtype=torch.bfloat16,
  File "/home/llm/miniconda3/envs/bigdl-cpu/lib/python3.9/site-packages/bigdl/llm/transformers/model.py", line 304, in from_pretrained
    model = cls.load_convert(q_k, optimize_model, *args, **kwargs)
  File "/home/llm/miniconda3/envs/bigdl-cpu/lib/python3.9/site-packages/bigdl/llm/transformers/model.py", line 425, in load_convert
    model = ggml_convert_low_bit(model, qtype, optimize_model,
  File "/home/llm/miniconda3/envs/bigdl-cpu/lib/python3.9/site-packages/bigdl/llm/transformers/convert.py", line 655, in ggml_convert_low_bit
    model = _optimize_ipex(model, qtype)
  File "/home/llm/miniconda3/envs/bigdl-cpu/lib/python3.9/site-packages/bigdl/llm/transformers/convert.py", line 733, in _optimize_ipex
    from transformers.modeling_attn_mask_utils import AttentionMaskConverter
ModuleNotFoundError: No module named 'transformers.modeling_attn_mask_utils'
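Since the module in the failing import only exists in newer transformers releases (the issue title points at >= 4.35), a minimal way to confirm the root cause outside the benchmark script is to attempt just that import in the same environment. This is a diagnostic sketch, not part of the project's tooling:

```python
# Standalone repro of the failing import; run inside the same conda env
# (bigdl-cpu in the traceback above) that hit the error.
try:
    # transformers.modeling_attn_mask_utils only exists in transformers >= 4.35.
    from transformers.modeling_attn_mask_utils import AttentionMaskConverter
except ModuleNotFoundError as exc:
    print("Reproduced:", exc)  # same error as the benchmark run
else:
    print("Import OK:", AttentionMaskConverter.__name__)
```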

Zephyr596 commented 6 months ago

Thank you for reaching out. Could you please specify which version of Transformers you are using?
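One quick way to answer this (a sketch assuming transformers and torch are importable in the failing environment) is to print the versions directly:

```python
# Print the package versions relevant to this issue so they can be pasted back here.
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
```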

Zephyr596 commented 6 months ago

What model are you using?

xiangyuT commented 6 months ago

We suggest transformers==4.35.2 or 4.36.2 if you want to accelerate inference with IPEX 2.2.0+cpu.
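A small sketch of checking an environment against these suggested pins; the exact version strings come from the comment above, and the pip command in the code comment is only the usual way to apply them:

```python
# Verify the environment matches the suggestion above, e.g. after
#   pip install transformers==4.36.2
# Assumes intel_extension_for_pytorch (IPEX) is already installed.
import transformers
import intel_extension_for_pytorch as ipex

suggested = {"4.35.2", "4.36.2"}
print("transformers:", transformers.__version__)
print("ipex:", ipex.__version__)

if transformers.__version__ not in suggested:
    print("transformers is outside the suggested 4.35.2 / 4.36.2 range; "
          "the modeling_attn_mask_utils import may still fail.")
```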