SJTU-IPADS / PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License · 7.97k stars · 415 forks
Issues
Note: the activation function in llama.cpp must be changed from GELU to ReLU (#230) — eraser333, closed 3 weeks ago, 0 comments
Error: the provided PTX was compiled with an unsupported toolchain (#229) — jiangzizi, opened 3 weeks ago, 0 comments
About the use of the OPT model (#228) — bobzhang208, opened 1 month ago, 0 comments
Add a new model in PowerInfer-2 (#227) — Francis235, opened 1 month ago, 1 comment
Qualcomm chips support (#226) — Francis235, opened 1 month ago, 0 comments
Question about the perplexity (#225) — eljrte, opened 1 month ago, 0 comments
How are the attention block weights allocated? (#224) — Yues007, opened 1 month ago, 0 comments
How can I obtain the weight files for the OPT model? (#223) — a1bc2def6g, opened 2 months ago, 1 comment
What does "co-activation" mean in Section 4.3 of the PowerInfer-2 paper? (#222) — exhyy, closed 1 month ago, 0 comments
Question about the video demo in the README (#221) — lyeXzot, closed 1 month ago, 0 comments
Measuring the predictor's overhead (#220) — guanchenl, opened 2 months ago, 0 comments
Help! Want a toy example running matmul on Q4_0 weights with a CUDA kernel (#219) — Eutenacity, opened 2 months ago, 0 comments
CUDA toolkit version? (#218) — shujiehan, opened 2 months ago, 1 comment
Fix segmentation fault for models exceeding 40B on AMD GPUs & optimize the mul_mat_axpy operation (#217) — Tworan, closed 2 months ago, 0 comments
Am I doing something wrong? (#216) — RealMrCactus, opened 2 months ago, 1 comment
Is there a WeChat, QQ, or other discussion group, or any plans to start one? (#215) — lzcchl, opened 3 months ago, 0 comments
Is .generated.gpuidx generated automatically when downloading the model with huggingface-cli? Is there another way to obtain it? (#214) — lzcchl, closed 3 months ago, 1 comment
Some questions about Fig. 4 (#213) — rhmaaa, opened 4 months ago, 5 comments
Support converting the TurboSparse Mistral model into PowerInfer GGUF (#212) — hodlen, opened 4 months ago, 0 comments
How do I obtain the predictor files? (#211) — LDLINGLINGLING, opened 4 months ago, 1 comment
Feature request: support for Phi-3 mini (#210) — raymond-infinitecode, opened 4 months ago, 0 comments
Is PowerInfer compatible with llama.cpp models? (#209) — mailonghua, opened 4 months ago, 0 comments
The output for the Q4 GGUF is strange again (#208) — milktea888, opened 4 months ago, 1 comment
About PowerInfer-2 (#207) — Ther-nullptr, opened 4 months ago, 0 comments
Can PowerInfer run CPU-only? (#206) — 0wwafa, closed 4 months ago, 1 comment
Add convert-hf-to-powerinfer-gguf.py to CMakeLists.txt (#205) — MatthewCroughan, closed 4 months ago, 1 comment
Minor fix for the README (#204) — YixinSong-e, closed 5 months ago, 0 comments
Where is the TurboSparse-Mixtral mlp_predictor? (#203) — MatthewCroughan, opened 5 months ago, 1 comment
Can it be used together with vLLM? (#202) — yadandan, opened 5 months ago, 0 comments
How to convert the ProSparse-LLaMA-2-13B model to .gguf? (#201) — Graysonicc, opened 5 months ago, 3 comments
Which llama.cpp version does the code use? (#200) — weizhenhuan, closed 5 months ago, 0 comments
CMake build fails on Windows (#199) — codetown, opened 5 months ago, 0 comments
Why is there no GPU offload during execution? (running on an A100 80G) (#198) — qw1319, closed 5 months ago, 2 comments
How can I convert LLaMA 3 8B and 70B to the GGUF model format? (#197) — tankvpython, opened 5 months ago, 0 comments
Supported quantization types (#196) — deleteeeee, closed 1 month ago, 1 comment
How to calculate the number of activated params in the TurboSparse paper? (#195) — ustcwhy, closed 5 months ago, 2 comments
Source for v2 (mobile inference engine) (#194) — peeteeman, opened 5 months ago, 7 comments
Minor fix in the README (#193) — YixinSong-e, closed 5 months ago, 0 comments
Add news (#192) — YixinSong-e, closed 5 months ago, 0 comments
ggml-cuda.cu:8949: invalid argument (#191) — NeverGpDzy, closed 5 months ago, 2 comments
Inference error (#190) — deleteeeee, closed 5 months ago, 0 comments
ReluFalcon 40B produces invalid output on llama.cpp (#189) — Zctoylm0927, closed 5 months ago, 4 comments
Model takes quite a long time to load (#188) — meicale, opened 6 months ago, 0 comments
AMD support (#187) — freelulul, closed 6 months ago, 0 comments
Will this work with Falcon 2? (#186) — aaronrmm, opened 6 months ago, 0 comments
Why AXPY? (#185) — richardweii, closed 6 months ago, 2 comments
Question about abnormal results measured on an A100 GPU (#184) — bulaikexiansheng, opened 6 months ago, 3 comments
Are there plans to support LLaMA 3 70B? (#183) — xiasw81, opened 6 months ago, 0 comments
CUDA cannot be found on an A100-80G (#182) — bulaikexiansheng, opened 7 months ago, 2 comments
Any plans to support llamafied Qwen1.5? (#181) — Ce-daros, closed 7 months ago, 2 comments