SJTU-IPADS / PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License · 7.97k stars · 415 forks
Issues
Note: the activation function in llama.cpp must be changed from GELU to ReLU (#230) — eraser333, closed 3 weeks ago, 0 comments
Error: the provided PTX was compiled with an unsupported toolchain (#229) — jiangzizi, opened 3 weeks ago, 0 comments
About the use of the OPT model (#228) — bobzhang208, opened 1 month ago, 0 comments
Add a new model in PowerInfer-2 (#227) — Francis235, opened 1 month ago, 1 comment
Qualcomm chips support (#226) — Francis235, opened 1 month ago, 0 comments
Question about the perplexity (#225) — eljrte, opened 1 month ago, 0 comments
How are the attention block weights allocated? (#224) — Yues007, opened 1 month ago, 0 comments
How can I obtain the weight files for the OPT model? (#223) — a1bc2def6g, opened 2 months ago, 1 comment
What does "co-activation" mean in Section 4.3 of the PowerInfer-2 paper? (#222) — exhyy, closed 1 month ago, 0 comments
Question about the video demo in the README (#221) — lyeXzot, closed 1 month ago, 0 comments
Measuring the predictor's overhead (#220) — guanchenl, opened 2 months ago, 0 comments
Help! Want a toy example running matmul on Q4_0 weights with a CUDA kernel (#219) — Eutenacity, opened 2 months ago, 0 comments
CUDA toolkit version? (#218) — shujiehan, opened 2 months ago, 1 comment
Fix segmentation fault for models exceeding 40B on AMD GPUs & optimize the mul_mat_axpy operation (#217) — Tworan, closed 2 months ago, 0 comments
Am I doing something wrong? (#216) — RealMrCactus, opened 2 months ago, 1 comment
Is there a WeChat, QQ, or other discussion group, or any plans to start one? (#215) — lzcchl, opened 3 months ago, 0 comments
Is .generated.gpuidx generated automatically when downloading the model with huggingface-cli? Is there another way to obtain it? (#214) — lzcchl, closed 3 months ago, 1 comment
Some questions about Fig. 4 (#213) — rhmaaa, opened 4 months ago, 5 comments
Support converting the TurboSparse Mistral model into PowerInfer GGUF (#212) — hodlen, opened 4 months ago, 0 comments
How do I obtain the predictor files? (#211) — LDLINGLINGLING, opened 4 months ago, 1 comment
Feature request: support for Phi-3 mini (#210) — raymond-infinitecode, opened 4 months ago, 0 comments
Is PowerInfer compatible with llama.cpp models? (#209) — mailonghua, opened 4 months ago, 0 comments
The output for the Q4 GGUF is strange again (#208) — milktea888, opened 4 months ago, 1 comment
About PowerInfer-2 (#207) — Ther-nullptr, opened 4 months ago, 0 comments
Can PowerInfer run CPU-only? (#206) — 0wwafa, closed 4 months ago, 1 comment
Add convert-hf-to-powerinfer-gguf.py to CMakeLists.txt (#205) — MatthewCroughan, closed 4 months ago, 1 comment
Minor fix for the README (#204) — YixinSong-e, closed 5 months ago, 0 comments
Where is the TurboSparse-Mixtral mlp_predictor? (#203) — MatthewCroughan, opened 5 months ago, 1 comment
Can it be used together with vLLM? (#202) — yadandan, opened 5 months ago, 0 comments
How to convert the ProSparse-LLaMA-2-13B model to .gguf? (#201) — Graysonicc, opened 5 months ago, 3 comments
Which llama.cpp version does the code use? (#200) — weizhenhuan, closed 5 months ago, 0 comments
CMake build fails on Windows (#199) — codetown, opened 5 months ago, 0 comments
Why is there no GPU offload during execution? (running on an A100 80G) (#198) — qw1319, closed 5 months ago, 2 comments
How can I convert LLaMA 3 8B and 70B to the GGUF model format? (#197) — tankvpython, opened 5 months ago, 0 comments
Supported quantization types (#196) — deleteeeee, closed 1 month ago, 1 comment
How to calculate the number of activated params in the TurboSparse paper? (#195) — ustcwhy, closed 5 months ago, 2 comments
Source for v2 (mobile inference engine) (#194) — peeteeman, opened 5 months ago, 7 comments
Minor fix in the README (#193) — YixinSong-e, closed 5 months ago, 0 comments
Add news (#192) — YixinSong-e, closed 5 months ago, 0 comments
ggml-cuda.cu:8949: invalid argument (#191) — NeverGpDzy, closed 5 months ago, 2 comments
Inference error (#190) — deleteeeee, closed 5 months ago, 0 comments
ReluFalcon 40B produces invalid output on llama.cpp (#189) — Zctoylm0927, closed 5 months ago, 4 comments
Model takes quite a long time to load (#188) — meicale, opened 6 months ago, 0 comments
AMD support (#187) — freelulul, closed 6 months ago, 0 comments
Will this work with Falcon 2? (#186) — aaronrmm, opened 6 months ago, 0 comments
Why AXPY? (#185) — richardweii, closed 6 months ago, 2 comments
Question about abnormal results measured on an A100 GPU (#184) — bulaikexiansheng, opened 6 months ago, 3 comments
Are there plans to support LLaMA 3 70B? (#183) — xiasw81, opened 6 months ago, 0 comments
CUDA cannot be found on an A100-80G (#182) — bulaikexiansheng, opened 7 months ago, 2 comments
Any plans to support llamafied Qwen1.5? (#181) — Ce-daros, closed 7 months ago, 2 comments