SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License

Feature request: Support for PHI3 mini #210

Open raymond-infinitecode opened 4 months ago

raymond-infinitecode commented 4 months ago


Feature Description

Phi-3 mini is currently the most capable SLM available, but could it be ReLU-fied (its non-ReLU activations swapped for ReLU to expose activation sparsity, as PowerInfer's supported models are) so that it runs fast enough for a single Xeon server to serve hundreds of concurrent users?
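The speedup requested here rests on activation sparsity: with ReLU, a large fraction of each MLP layer's neurons output exactly zero, so their weights can be skipped at inference time. A minimal stdlib-only sketch of that effect, using synthetic zero-mean pre-activations rather than real Phi-3 activations (which go through SiLU, not ReLU):

```python
import random

def relu(x):
    # ReLU zeroes every non-positive input, which is what creates
    # the exploitable sparsity.
    return x if x > 0.0 else 0.0

# Illustrative stand-in for one MLP layer's pre-activations; real
# Phi-3 activation statistics would differ.
random.seed(0)
pre_acts = [random.gauss(0.0, 1.0) for _ in range(10_000)]
acts = [relu(x) for x in pre_acts]

# Roughly half the neurons are exactly zero and could be skipped.
sparsity = sum(1 for a in acts if a == 0.0) / len(acts)
print(f"fraction of zero activations: {sparsity:.2f}")
```

In PowerInfer this sparsity is what lets the predictors route only the "hot" neurons to the GPU and skip cold ones entirely.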

Motivation

Please provide a detailed written description of reasons why this feature is necessary and how it is useful to PowerInfer users.

Possible Implementation

Convert the Phi-3 model into a ReLU-activated variant (replacing its activation function and fine-tuning to recover accuracy, as was done for the existing PowerInfer model family) so PowerInfer can exploit its activation sparsity.
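The conversion step above can be sketched abstractly: walk the model's layers and swap the gated activation (SiLU in Phi-3's MLP blocks) for ReLU. The classes and attribute names below are illustrative stand-ins, not the real Phi-3 or PowerInfer code, and a real conversion would need fine-tuning afterwards to recover accuracy:

```python
import math

def silu(x):
    # SiLU (x * sigmoid(x)), the activation Phi-3's MLP actually uses.
    return x / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

class MLPBlock:
    """Toy stand-in for a transformer MLP block holding its activation."""
    def __init__(self, act_fn):
        self.act_fn = act_fn

class ToyModel:
    def __init__(self, n_layers):
        self.layers = [MLPBlock(silu) for _ in range(n_layers)]

def relufy(model):
    # Swap every layer's activation in place; after this the model
    # would be fine-tuned so its weights adapt to ReLU.
    for layer in model.layers:
        layer.act_fn = relu
    return model

model = relufy(ToyModel(4))
```

After conversion, every MLP block zeroes its negative pre-activations, which is the property PowerInfer's sparsity predictors rely on.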