kvcache-ai / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Apache License 2.0 · 741 stars · 39 forks
Issues
| #    | Title | Author | State | When | Comments |
|------|-------|--------|-------|------|----------|
| #115 | :zap: rm opt config path default value and fix some config logic bug | KMSorSMS | closed | 1 week ago | 0 |
| #114 | Error when reproducing InternLM2.5-7B-Chat-1M | Cherishyt | opened | 1 week ago | 0 |
| #113 | RuntimeError CUDA error when running Infinite Bench | Flitieter | opened | 2 weeks ago | 0 |
| #112 | Hardware configuration support | Cherishyt | opened | 2 weeks ago | 1 |
| #111 | Support for New SOTA MoE: tencent/Tencent-Hunyuan-Large | ThomasBaruzier | opened | 2 weeks ago | 0 |
| #110 | refactor local_chat & config setting | KMSorSMS | closed | 2 weeks ago | 0 |
| #109 | install error on windows, need help | gaowayne | opened | 3 weeks ago | 0 |
| #108 | Detailed specification of the computer hardware to run 236B DeepSeek-Coder-V2 | atomlayer | opened | 3 weeks ago | 1 |
| #107 | feature request: support internvl2 | kolinfluence | opened | 3 weeks ago | 0 |
| #106 | [Fix] Fix readme structure. | Azure-Tang | closed | 3 weeks ago | 0 |
| #105 | how to implement new algorithm in this repo? | lumiere-ml | closed | 3 weeks ago | 1 |
| #104 | Attempting to increase output to 16k results in crash during output | bitbottrap | opened | 1 month ago | 1 |
| #103 | How to infer quantized models on CPU&GPU | shuzhang-pku | closed | 1 month ago | 1 |
| #102 | Error loading model: token_embd.weight not found in GGUF file | antonovkz | opened | 1 month ago | 1 |
| #101 | Long prompt with DeepSeek crashing with tensor size mismatch | bitbottrap | opened | 1 month ago | 11 |
| #100 | Does ktransformers support deepseek V2.5? | huliangbing | closed | 1 month ago | 2 |
| #99 | Adapt Windows | chenht2022 | closed | 1 month ago | 0 |
| #96 | Error Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. | drrros | opened | 1 month ago | 5 |
| #95 | Suggestion to add DeepSeek v2.5 support | arisau | closed | 2 months ago | 4 |
| #94 | ImportError: DLL load failed while importing KTransformersOps: The specified module was not found. | SCP12rs | opened | 2 months ago | 7 |
| #93 | DeepSeek-V2 inference is very slow; it appears to run on the CPU with very low GPU utilization | Chain-Mao | closed | 1 month ago | 3 |
| #92 | Specify MAX_NEW_TOKENS for ktransformers server | arthurv | opened | 2 months ago | 2 |
| #91 | How can I use opencompass benchmark tools to test ktransformers in long context? | AsVoider | opened | 2 months ago | 1 |
| #90 | Installation Problem | Chain-Mao | closed | 2 months ago | 1 |
| #89 | Installation requirements | arthurv | opened | 2 months ago | 4 |
| #88 | [fix] Fix some gpu dequant function doesn't support multi gpu bug | Azure-Tang | closed | 2 months ago | 0 |
| #87 | are marline and q4k totally equivalent? | Eutenacity | closed | 1 month ago | 5 |
| #86 | typo fix: KMisrtal -> KMistral | xhedit | closed | 2 months ago | 0 |
| #85 | Getting reasonable performance on dual RTX 3090 and 128gb | trilog-inc | opened | 2 months ago | 7 |
| #84 | Could you provide a detailed hardware configuration list? | qixing-ai | opened | 2 months ago | 2 |
| #83 | Use cond var to avoid busy loop | sayap | closed | 1 month ago | 1 |
| #82 | Seg Fault on long replies | matthusby | closed | 2 months ago | 2 |
| #81 | Fix backend | chenht2022 | closed | 2 months ago | 0 |
| #80 | Busy loop in cpu_backend/task_queue.cpp keeps 1 thread at 100% CPU when queue is empty | sayap | closed | 1 month ago | 5 |
| #79 | Is deepseek-ai/DeepSeek-V2.5 supported? | AshD | closed | 2 months ago | 9 |
| #77 | Fix: Wrong type of token list returned by prefill_and_generate | TKONIY | closed | 1 month ago | 0 |
| #76 | 8-GPU configuration on L40 OOM | fengyang95 | closed | 2 months ago | 8 |
| #74 | How can i run internlm2_5-7b-chat-1m in ktransformers? | Ma1oneZhang | closed | 2 months ago | 4 |
| #73 | When the input token exceeds 4096, an error will occur. | fengyang95 | closed | 2 months ago | 4 |
| #72 | Support IQ4_XS dequantize | sayap | closed | 2 months ago | 4 |
| #71 | [fix] Fix qlen > chunk_size mask is none error | Azure-Tang | closed | 2 months ago | 0 |
| #70 | UnboundLocalError: cannot access local variable 'chunck_mask' where it is not associated with a value | fengyang95 | closed | 2 months ago | 2 |
| #69 | Missing pip packages flash_attn and wheel | bitbottrap | closed | 2 months ago | 2 |
| #68 | What is the maximum input token size supported for DeepSeek V2? | fengyang95 | closed | 2 months ago | 1 |
| #67 | [fix] fix bugs about Qwen2-57B, install requirement, DockerFile | UnicornChan | closed | 2 months ago | 0 |
| #66 | docker container fails to start due to missing package 'uvicorn' | sammcj | closed | 2 months ago | 1 |
| #65 | Would you support glm4-chat-1m | choyakawa | opened | 2 months ago | 1 |
| #64 | docs: update long_context_introduction.md | eltociear | closed | 2 months ago | 0 |
| #62 | [Fix] Fix problem that ktransformers cannot offload whole layer in cpu | Azure-Tang | closed | 2 months ago | 0 |
| #61 | docker builds and pip install broken - No module named 'cpufeature' | sammcj | closed | 2 months ago | 5 |