FMInference / FlexiGen

Running large language models on a single GPU for throughput-oriented scenarios.
Apache License 2.0

Just a suggestion: Think about what Automatic1111 did to Stable Diffusion #34

Open cmp-nct opened 1 year ago

cmp-nct commented 1 year ago

Think about what Automatic1111 did to Stable Diffusion. It started as a rather crude one-shot image generator, significantly worse than its commercial counterparts, and is now a distribution with thousands of features, hundreds of extensions, a Gradio web UI, and even an API. Its development pace is sometimes almost impossible to keep up with. Model performance has increased at least 2x since the original release while consuming a fraction of the original GPU memory, and it even goes beyond the capabilities of the model.

The core reason this worked out was the local, automated installation process: it simply works on almost any system. All you need to do is pull the Git repository, and everything else is downloaded and installed for you. No need to fight with dependencies, etc.

The second reason is the highly active team of devs (and of course the few core devs) who made it possible to integrate thousands of contributions.

FlexGen looks to me like it has the potential to shake up the industry in a similar way. But to draw in users and devs, it needs to start with accessibility: trouble-free installation and a Gradio or similar web interface.

oobabooga commented 1 year ago

Have fun (FlexGen support has been added today)

https://github.com/oobabooga/text-generation-webui

tensiondriven commented 1 year ago

And there's always https://github.com/KoboldAI/KoboldAI-Client/ which probably won't get FlexGen support for some time.