FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
Apache License 2.0 · 9.18k stars · 548 forks
Update README.md #31
Closed · zhangce closed this issue 1 year ago
zhangce commented 1 year ago:
Add in the limitation. Make it clear it is for throughput-oriented scenarios.