intel-analytics ipex-llm issues

intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.

Apache License 2.0

6.75k stars 1.27k forks

issues

[NPU] Support GW for NPU C++

#12450 rnwang04 closed 5 hours ago
0
[NPU] further fix of qwen2 int8 pipeline & C++

#12449 rnwang04 closed 6 hours ago
0
Request to upgrade "Langchain-Chatchat" based on the latest version in github.

#12448 liang1wang opened 8 hours ago
0
[Test] Build vllm image to test prefix caching

#12447 hzjane opened 8 hours ago
0
[NPU] Fix abnormal output for Qwen2-7B when CW `sym_int8`

#12446 Oscilloscope98 closed 7 hours ago
0
Remove BIGDL_LLM_XMX_DISABLED in documentation

#12445 cranechu0131 opened 10 hours ago
0
add sdxl and lora-lcm optimization

#12444 JinheTang closed 11 hours ago
0
Optimize first token of C++ NPU by adding npu_dpu_groups

#12443 rnwang04 closed 11 hours ago
0
Add support of llama3.2 for NPU C++

#12442 rnwang04 closed 13 hours ago
0
optimize sdxl again

#12441 MeouSker77 closed 1 day ago
0
Inference is exceptionally slow on the L20 GPU

#12440 joey9503 opened 1 day ago
1
small change

#12439 MeouSker77 closed 1 day ago
0
Support qwen2.5 3B for NPU & update related examples

#12438 rnwang04 closed 1 day ago
0
add chinese prompt troubleshooting for npu cpp examples

#12437 JinheTang closed 1 day ago
0
fix and optimize sd

#12436 MeouSker77 closed 1 day ago
0
Kernel NULL pointer dereference in i915 driver

#12435 luhuaei opened 1 day ago
1
Support minicpm for NPU C++

#12434 rnwang04 closed 1 day ago
0
update serving image runtime

#12433 pepijndevos closed 8 hours ago
8
Unable to inference with Qwen2.5 GPTQ model

#12432 notsyncing closed 2 days ago
3
support Llama2-7B / Llama3-8B for NPU C++

#12431 rnwang04 closed 4 days ago
0
New convert support for C++ NPU

#12430 rnwang04 closed 4 days ago
0
Update english prompt to 34k in vllm_online_benchmark.py

#12429 liu-shaojun closed 4 days ago
0
Error loading for file torch\lib\backend_with_compiler.dll

#12428 LiangtaoJin closed 4 days ago
1
nf4 still unsupported?

#12427 epage480 opened 5 days ago
1
Disable XMX

#12426 NikosDi opened 5 days ago
4
small fix

#12425 rnwang04 closed 5 days ago
0
Upgrade dependency for Windows LNL/ARL support

#12424 Oscilloscope98 closed 5 days ago
0
add optimization to openjourney

#12423 JinheTang closed 5 days ago
0
Add release support for option `xpu_arc`

#12422 Oscilloscope98 opened 5 days ago
0
update batch kernel condition

#12421 MeouSker77 closed 5 days ago
0
Error: llama runner process has terminated: error loading model: No device of requested type available

#12420 fanlessfan opened 5 days ago
10
Optimize with new batch kernel when `batch_size=1` on LNL

#12419 Oscilloscope98 closed 5 days ago
0
add Stable diffusion examples

#12418 JinheTang closed 6 days ago
0
Initial NPU C++ Example

#12417 rnwang04 closed 5 days ago
0
Fix speech_paraformer issue with unexpected changes

#12416 sgwhat closed 1 week ago
2
Add multimodal benchmark

#12415 hzjane closed 6 days ago
0
Update benchmark_vllm_throughput.py

#12414 gc-fu closed 1 week ago
0
docs: add Japanese README

#12413 eltociear opened 1 week ago
0
'AutoModel' object has no attribute 'config' when using Speech_Paraformer-Large on NPU

#12412 fanyhchn opened 1 week ago
1
Update Ollama with IPEX-LLM to a newer version

#12411 NikosDi opened 1 week ago
1
[NPU] GW prefill merge qkv

#12410 cyita opened 1 week ago
0
Add install_windows_gpu.zh-CN.md and install_linux_gpu.zh-CN.md

#12409 joan726 closed 1 week ago
0
update batch kernel condition

#12408 MeouSker77 closed 1 week ago
0
fix again

#12407 rnwang04 closed 1 week ago
0
fix workflow again

#12406 rnwang04 closed 1 week ago
0
Tiny doc fix

#12405 Oscilloscope98 closed 1 week ago
1
Fix npu pipeline release workflow

#12404 rnwang04 closed 1 week ago
0
Path of models using Ollama with IPEX-LLM (Windows)

#12403 NikosDi closed 1 week ago
4
[NPU] dump prefill IR for further C++ solution

#12402 rnwang04 closed 6 days ago
1
Support performance mode of GLM4 model

#12401 Oscilloscope98 closed 1 week ago
0