FMInference/FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
Apache License 2.0
9.22k stars
549 forks
Support opt-iml #10
Closed
Ying1123 closed 1 year ago