kvcache-ai / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Apache License 2.0 · 741 stars · 39 forks
Issues
| #    | Title | Author | State | When | Comments |
|------|-------|--------|-------|------|----------|
| #115 | :zap: rm opt config path default value and fix some config logic bug | KMSorSMS | closed | 1 week ago | 0 |
| #114 | Error when reproducing InternLM2.5-7B-Chat-1M | Cherishyt | opened | 1 week ago | 0 |
| #113 | RuntimeError CUDA error when running Infinite Bench | Flitieter | opened | 2 weeks ago | 0 |
| #112 | Hardware configuration support | Cherishyt | opened | 2 weeks ago | 1 |
| #111 | Support for New SOTA MoE: tencent/Tencent-Hunyuan-Large | ThomasBaruzier | opened | 2 weeks ago | 0 |
| #110 | refactor local_chat & config setting | KMSorSMS | closed | 2 weeks ago | 0 |
| #109 | install error on windows, need help | gaowayne | opened | 3 weeks ago | 0 |
| #108 | Detailed specification of the computer hardware to run 236B DeepSeek-Coder-V2 | atomlayer | opened | 3 weeks ago | 1 |
| #107 | feature request: support internvl2 | kolinfluence | opened | 3 weeks ago | 0 |
| #106 | [Fix] Fix readme structure. | Azure-Tang | closed | 3 weeks ago | 0 |
| #105 | how to implement new algorithm in this repo? | lumiere-ml | closed | 3 weeks ago | 1 |
| #104 | Attempting to increase output to 16k results in crash during output | bitbottrap | opened | 1 month ago | 1 |
| #103 | How to infer quantized models on CPU&GPU | shuzhang-pku | closed | 1 month ago | 1 |
| #102 | Error loading model: token_embd.weight not found in GGUF file | antonovkz | opened | 1 month ago | 1 |
| #101 | Long prompt with DeepSeek crashing with tensor size mismatch | bitbottrap | opened | 1 month ago | 11 |
| #100 | Does ktransformers support deepseek V2.5? | huliangbing | closed | 1 month ago | 2 |
| #99 | Adapt Windows | chenht2022 | closed | 1 month ago | 0 |
| #96 | Error Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. | drrros | opened | 1 month ago | 5 |
| #95 | Suggestion to add DeepSeek v2.5 support | arisau | closed | 2 months ago | 4 |
| #94 | ImportError: DLL load failed while importing KTransformersOps: The specified module was not found. | SCP12rs | opened | 2 months ago | 7 |
| #93 | DeepSeek-V2 inference is very slow; it appears to run on the CPU with very low GPU utilization | Chain-Mao | closed | 1 month ago | 3 |
| #92 | Specify MAX_NEW_TOKENS for ktransformers server | arthurv | opened | 2 months ago | 2 |
| #91 | How can I use opencompass benchmark tools to test ktransformers in long context? | AsVoider | opened | 2 months ago | 1 |
| #90 | Installation Problem | Chain-Mao | closed | 2 months ago | 1 |
| #89 | Installation requirements | arthurv | opened | 2 months ago | 4 |
| #88 | [fix] Fix some gpu dequant function doesn't support multi gpu bug | Azure-Tang | closed | 2 months ago | 0 |
| #87 | are marline and q4k totally equivalent? | Eutenacity | closed | 1 month ago | 5 |
| #86 | typo fix: KMisrtal -> KMistral | xhedit | closed | 2 months ago | 0 |
| #85 | Getting reasonable performance on dual RTX 3090 and 128gb | trilog-inc | opened | 2 months ago | 7 |
| #84 | Could you provide a detailed hardware configuration list? | qixing-ai | opened | 2 months ago | 2 |
| #83 | Use cond var to avoid busy loop | sayap | closed | 1 month ago | 1 |
| #82 | Seg Fault on long replies | matthusby | closed | 2 months ago | 2 |
| #81 | Fix backend | chenht2022 | closed | 2 months ago | 0 |
| #80 | Busy loop in cpu_backend/task_queue.cpp keeps 1 thread at 100% CPU when queue is empty | sayap | closed | 1 month ago | 5 |
| #79 | Is deepseek-ai/DeepSeek-V2.5 supported? | AshD | closed | 2 months ago | 9 |
| #77 | Fix: Wrong type of token list returned by prefill_and_generate | TKONIY | closed | 1 month ago | 0 |
| #76 | 8-GPU configuration on L40 OOM | fengyang95 | closed | 2 months ago | 8 |
| #74 | How can i run internlm2_5-7b-chat-1m in ktransformers? | Ma1oneZhang | closed | 2 months ago | 4 |
| #73 | When the input token exceeds 4096, an error will occur. | fengyang95 | closed | 2 months ago | 4 |
| #72 | Support IQ4_XS dequantize | sayap | closed | 2 months ago | 4 |
| #71 | [fix] Fix qlen > chunk_size mask is none error | Azure-Tang | closed | 2 months ago | 0 |
| #70 | UnboundLocalError: cannot access local variable 'chunck_mask' where it is not associated with a value | fengyang95 | closed | 2 months ago | 2 |
| #69 | Missing pip packages flash_attn and wheel | bitbottrap | closed | 2 months ago | 2 |
| #68 | What is the maximum input token size supported for DeepSeek V2? | fengyang95 | closed | 2 months ago | 1 |
| #67 | [fix] fix bugs about Qwen2-57B, install requirement, DockerFile | UnicornChan | closed | 2 months ago | 0 |
| #66 | docker container fails to start due to missing package 'uvicorn' | sammcj | closed | 2 months ago | 1 |
| #65 | Would you support glm4-chat-1m | choyakawa | opened | 2 months ago | 1 |
| #64 | docs: update long_context_introduction.md | eltociear | closed | 2 months ago | 0 |
| #62 | [Fix] Fix problem that ktransformers cannot offload whole layer in cpu | Azure-Tang | closed | 2 months ago | 0 |
| #61 | docker builds and pip install broken - No module named 'cpufeature' | sammcj | closed | 2 months ago | 5 |