Accelerate local LLM inference and fine-tuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., a local PC with iGPU and NPU, or a discrete GPU such as Arc, Flex, and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
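
As a quick illustration, the sketch below shows HuggingFace-style 4-bit inference with the `ipex_llm.transformers` API on an Intel GPU. The model ID is a placeholder (any supported HuggingFace model should work), and it assumes an Intel GPU exposed to PyTorch as the `xpu` device; exact flags may vary by version.

```python
# Minimal sketch: 4-bit (INT4) inference with ipex-llm on an Intel GPU.
# Assumes `ipex-llm` is installed with XPU support and a GPU visible as "xpu".
import torch
from ipex_llm.transformers import AutoModelForCausalLM  # drop-in HF-style API
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"  # placeholder model ID

# load_in_4bit=True quantizes the weights to INT4 at load time
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_4bit=True,
    trust_remote_code=True,
)
model = model.to("xpu")  # move the quantized model onto the Intel GPU

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

with torch.inference_mode():
    input_ids = tokenizer.encode("What is an Intel XPU?", return_tensors="pt").to("xpu")
    output = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The same `from_pretrained` / `generate` flow mirrors plain HuggingFace Transformers, so existing pipelines typically only need the import swap and the `.to("xpu")` move.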