FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.
Apache License 2.0

When will the optimizer for determining offload strategy be released? #107

Open frankxyy opened 1 year ago

frankxyy commented 1 year ago

Currently, the parameters for the offload strategy are passed on the command line by the user. When will the optimizer be released so that users can search for these parameters?