SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License
7.9k stars, 406 forks
Optimize `mul_mat_sparse` for INT4 quantized weights #174
Closed
hodlen closed 6 months ago