SJTU-IPADS / PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License · 7.96k stars · 412 forks

Issues (sorted by newest)
not enough space in the buffer with long prompts
#129
RachelShalom
opened
9 months ago
2
Problems encountered when reproducing the benchmark
#128
wahaha22
opened
9 months ago
9
Fix CUDA performance regression due to auto offloading
#127
hodlen
closed
9 months ago
0
How can I show the output results in a web service, or get the inference results for another application?
#126
xujiangyu
opened
9 months ago
3
Looking forward to support for Mixtral 8x7B
#125
shmily91
closed
10 months ago
1
Would you please kindly offer the data, code, or settings for training the predictor?
#124
Raincleared-Song
closed
10 months ago
1
./build/bin/main is not recognized as the name of a cmdlet, function, script file, or operable program
#123
onlyone-hyphen
closed
9 months ago
2
provided PTX was compiled with an unsupported toolchain
#122
thangld201
opened
10 months ago
1
Benchmarks
#121
hasanar1f
opened
10 months ago
2
where can I download the predictor of Relu-Falcon-40B (float16)?
#120
chenglimin
opened
10 months ago
7
No CUDA toolset found
#119
c469591
opened
10 months ago
5
Irrelevant replies to prompts - Llama or PowerInfer issue?
#118
bluusun
opened
10 months ago
1
What are the test conditions for Figure 18 in your paper?
#117
chenglimin
opened
10 months ago
12
How to obtain 'predictor weights'?
#116
harikrishnaapc
opened
10 months ago
4
Will it run on Windows?
#115
Sandy4321
closed
10 months ago
2
Support CPU/GPU inference on Windows
#114
hodlen
closed
10 months ago
0
Fix argument parsing in examples/batched
#113
hodlen
closed
10 months ago
0
Support broader python and pip versions
#112
hodlen
closed
10 months ago
0
Can the MLP predictor predict all the neurons?
#111
curiousNick1
opened
10 months ago
2
GPU is not used after model is loaded
#110
mio-19
opened
10 months ago
4
How can I get the same answer every time?
#109
sunnyregion
opened
10 months ago
1
CUDA error 13 at /home/PowerInfer/ggml-cuda.cu:9619: invalid device symbol
#108
zilunzhang
opened
10 months ago
2
Windows Visual Studio build fails
#107
dyt06
closed
10 months ago
1
Support setting VRAM budget for `examples/server`
#106
hodlen
closed
10 months ago
0
Can anyone help explain the Chinese meaning of the inference results?
#105
Gengchunsheng
opened
10 months ago
1
Will support for CodeLlama be considered?
#104
littlebai3618
opened
10 months ago
1
Falcon-40B model inference output is unreadable
#103
jqliu42
opened
10 months ago
1
Can 01-ai's Yi model series be supported? Its architecture appears to be the same as Llama's
#102
felixstander
closed
10 months ago
1
./build/bin/main -m /PATH/TO/MODEL -n $output_token_count -t $thread_num -p $prompt: '.' is not recognized as an internal or external command or an operable program
#101
18635191739
closed
10 months ago
2
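The error quoted in this title is what Windows cmd.exe reports when the Unix-style "./" prefix is used; the command itself is meant for a Unix-like shell. A minimal sketch of that invocation, run from the repository root, with the model path and the $output_token_count, $thread_num, and $prompt variables left as placeholders exactly as in the title:

# run the compiled main binary from the repository root (Unix-like shell)
./build/bin/main -m /PATH/TO/MODEL -n $output_token_count -t $thread_num -p $prompt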
Compatibility issue with densely activated Llama models
#100
1562668477
opened
10 months ago
6
Fix generation error under INT4 quantization and batched prompting
#99
hodlen
closed
10 months ago
0
Further optimisation of hybrid inference
#98
hodlen
opened
10 months ago
0
Optimize CUDA sparse operator with Tensor Core
#97
hodlen
opened
10 months ago
0
Kernel fusion to reduce communication overhead
#96
hodlen
opened
10 months ago
0
Reclaim memory from offloaded model weights
#95
hodlen
opened
10 months ago
1
How to convert a Llama-family model to powerinfer.gguf?
#94
Mokuroh0924
closed
10 months ago
1
Meta: Wider model support for PowerInfer
#93
hodlen
opened
10 months ago
10
Meta: Implementing hybrid inference across key desktop platforms
#92
hodlen
opened
10 months ago
0
I also ran into a similar problem, stdatomic.h not found, but on the Linux platform
#91
yinghuo302
closed
10 months ago
1
Update issue templates of PowerInfer
#90
hodlen
closed
10 months ago
0
Add our Kanban to README.md
#89
hodlen
closed
10 months ago
0
macOS/Metal inference support
#88
hodlen
opened
10 months ago
0
WSL + CUDA issues
#87
hodlen
opened
10 months ago
0
Windows CPU/GPU support
#86
hodlen
closed
10 months ago
2
Fix offloading / VRAM budget bugs
#85
hodlen
opened
10 months ago
2
How are the original weights and predictor weights generated?
#84
sunnyregion
opened
10 months ago
2
Can we make it run on other models?
#83
YLSnowy
opened
10 months ago
6
Converting GGUF Models and Support for Smaller Models
#82
nndnnv
opened
10 months ago
1
Didn't use GPU
#81
yuxx0218
closed
10 months ago
4
cmake -S . -B build -DLLAMA_CUBLAS=ON
#80
hungptit123
opened
10 months ago
1
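The title of this issue is only the CMake configure step. A minimal sketch of a full out-of-source build with the cuBLAS backend enabled, assuming a standard CMake setup; the Release configuration is a common choice here, not something the issue specifies:

# configure with the cuBLAS backend enabled, then build
cmake -S . -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release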