intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.

Determining if AMX is in use by ollama #11496

Open js333031 opened 6 days ago

js333031 commented 6 days ago

Hello, I used the latest steps to install ipex-llm into a venv on a 5th Gen Xeon system. Based on the screenshot below, I don't think AMX is being utilized. Should AMX show up in the list of CPU features in the output below (last line)? lscpu shows that the AMX instructions are present. Can you please provide a verification step confirming that the CPU optimizations are installed in the ipex-llm venv?

[Screenshot: ollama server startup output; the last line lists the detected CPU features.]

Thanks
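
For reference, one way to sanity-check AMX at two levels is sketched below: first whether the kernel exposes the AMX feature flags at all, then whether oneDNN actually dispatches AMX kernels at runtime. This is a sketch, not an official verification step: the `ONEDNN_VERBOSE` check covers the PyTorch/IPEX code path and may not apply to the llama.cpp-based ollama binary.

```bash
# 1) Kernel-level check: AMX feature flags exposed by the CPU.
#    A 5th Gen Xeon should list amx_bf16, amx_int8 and amx_tile.
lscpu | grep -o 'amx_[a-z0-9]*' | sort -u

# 2) Runtime check for the PyTorch/IPEX path: with ONEDNN_VERBOSE=1, oneDNN
#    logs the ISA each kernel was dispatched for. On AMX-capable hardware a
#    bf16 matmul should show an "avx512_core_amx" kernel in the output.
ONEDNN_VERBOSE=1 python - <<'EOF'
import torch

# bf16 matmul on CPU is expected to dispatch to oneDNN; the verbose log
# printed to stdout should name the ISA the kernel was generated for.
a = torch.randn(256, 256, dtype=torch.bfloat16)
b = torch.randn(256, 256, dtype=torch.bfloat16)
torch.mm(a, b)
EOF
```

If step 1 prints nothing, the kernel or VM is not exposing AMX and no runtime check will help; if step 1 passes but step 2 never mentions "amx", the installed build is falling back to AVX-512 kernels.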

sgwhat commented 5 days ago

Hi @js333031, ipex-llm ollama is a GPU-optimized version, so we currently do not plan to work on this.
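
Note for anyone landing here: ollama is built on llama.cpp, whose startup log prints a system_info line listing the instruction sets the binary was built with and detects at runtime; upstream llama.cpp builds with AMX support include an AMX_INT8 flag in that line. A sketch of what to grep for follows; the log path is a placeholder, and it assumes the ipex-llm build prints the same line as upstream.

```bash
# Pull the llama.cpp system_info line out of the ollama server log.
# "ollama-server.log" is hypothetical; substitute wherever your server
# output actually goes (e.g. the terminal running "ollama serve").
grep 'system_info' ollama-server.log

# On an upstream AMX-enabled build running on AMX hardware you would expect
# something like (illustrative, abbreviated):
#   system_info: ... AVX512 = 1 | ... | AMX_INT8 = 1 | ...
```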