Implement RESTful API of FlexGen

FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

Apache License 2.0


Open Fyphen1223 opened 10 months ago

Fyphen1223 commented 10 months ago

I would really like a RESTful API for this project. I want to implement it myself, but that is nearly impossible for me :(