intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.

Run llama2 on windows A750 failed: No module named 'linear_fp16_esimd' #10698

Closed: qiuxin2012 closed this issue 6 months ago

qiuxin2012 commented 7 months ago

I get the error below:

  File "C:\Users\arda\miniconda3\envs\xin-llm\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\arda\miniconda3\envs\xin-llm\lib\site-packages\ipex_llm\transformers\models\llama.py", line 320, in llama_attention_forward_4_31
    return forward_function(
  File "C:\Users\arda\miniconda3\envs\xin-llm\lib\site-packages\ipex_llm\transformers\models\llama.py", line 642, in llama_attention_forward_4_31_original
    use_esimd_sdp(q_len, key_states.shape[2], self.head_dim, query_states, attention_mask):
  File "C:\Users\arda\miniconda3\envs\xin-llm\lib\site-packages\ipex_llm\transformers\models\utils.py", line 336, in use_esimd_sdp
    import linear_fp16_esimd
ModuleNotFoundError: No module named 'linear_fp16_esimd'
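For reference, the failure happens when use_esimd_sdp() tries to import the linear_fp16_esimd extension and it is not present in the installed ipex-llm build. A minimal diagnostic sketch (not part of ipex-llm itself, just standard Python) to check whether that module is importable in the current environment:

  # Hypothetical diagnostic: check whether the ESIMD kernel module that
  # use_esimd_sdp() imports is available in this Python environment.
  import importlib.util

  spec = importlib.util.find_spec("linear_fp16_esimd")
  print("linear_fp16_esimd available:", spec is not None)

If this prints False, the installed build does not include the ESIMD kernels, which matches the ModuleNotFoundError above.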
Oscilloscope98 commented 7 months ago

This will be supported in https://github.com/intel-analytics/ipex-llm/pull/10705 and https://github.com/intel-analytics/llm.cpp/pull/333