SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License
7.96k stars, 412 forks

Further optimisation of hybrid inference #98

Status: Open. hodlen opened 10 months ago

hodlen commented 10 months ago

Related issues/proposals: