intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc

Update Ollama with IPEX-LLM to a newer version #12411

Open NikosDi opened 1 week ago

NikosDi commented 1 week ago

Hello.

It seems that the Ollama version currently bundled with IPEX-LLM (0.3.6) is getting a bit old.

It doesn't have proper support for new and popular models like:

1) Phi 3.5
2) Qwen 2.5
3) Llama 3.2
4) Llama 3.2-vision

I tried the first two (Phi 3.5 and Qwen 2.5) with Intel's Ollama (IPEX-LLM), and they produce strange results. Phi 3.5 in particular produces gibberish output.
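Roughly, the test looked like this. This is only a sketch: it assumes the IPEX-LLM build of ollama is on PATH with its server already running, uses the phi3.5 and qwen2.5 tags from the Ollama library, and the prompt is just an example.

```bash
# Sketch of the test, assuming the IPEX-LLM ollama binary is on PATH
# and its server is already running.
ollama pull phi3.5
ollama run phi3.5 "Give a one-sentence summary of what IPEX-LLM does."
# -> on the 0.3.6 IPEX-LLM build, the Phi 3.5 reply comes back as gibberish

ollama pull qwen2.5
ollama run qwen2.5 "Give a one-sentence summary of what IPEX-LLM does."
```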

Also, newer versions include bug fixes, new and very useful commands, and CPU-side performance optimizations, for example:

- Bug fix: setting OLLAMA_NUM_PARALLEL no longer causes models to be reloaded on lower-VRAM systems.
- New (very useful) command: `ollama stop`, to unload a running model.
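For illustration, this is the kind of workflow those two changes enable. The command and variable names are as documented by upstream Ollama; the llama3.2 model tag is only an example.

```bash
# OLLAMA_NUM_PARALLEL is read by the Ollama server, so set it before starting the server.
export OLLAMA_NUM_PARALLEL=4
ollama serve &

# Run a model, then explicitly unload it without restarting the server.
ollama run llama3.2 "Hello"
ollama stop llama3.2
```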

Thank you.

qiuxin2012 commented 1 week ago

We are planning a new rebase; related issue: https://github.com/intel-analytics/ipex-llm/issues/12370