Stability-AI / StableLM

StableLM: Stability AI Language Models
Apache License 2.0

Is it normal to take a long time (about 15 min) to generate an answer? #78

Closed nicezic closed 1 year ago

nicezic commented 1 year ago

I use an RTX 3070 Ti with 8 GB of VRAM, and a 32-core Ryzen CPU.

Is it normal to take a long time (about 15 min) to generate an answer?

My params are:

model_name = "stabilityai/stablelm-tuned-alpha-7b"
torch_dtype = "bfloat16" #@param ["float16", "bfloat16", "float"]
load_in_8bit = False #@param {type:"boolean"}
device_map = "auto"

Is there a way to speed up the generation?

mcmonkey4eva commented 1 year ago

You've configured it to load in 16-bit, but you have only about half the VRAM needed to fit the model in the first place. It's likely running on the CPU, or offloading to system RAM.
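A rough back-of-envelope check (assuming roughly 7B parameters and counting weights only, ignoring activations and the KV cache) shows why 8 GB isn't enough at 16-bit:

```python
# Approximate weight memory for a ~7B-parameter model at different precisions.
# Weights only; activations and the KV cache need additional room on top.
params = 7e9

for precision, bytes_per_param in [("bfloat16", 2), ("int8", 1), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{precision}: ~{gib:.1f} GiB")
# bfloat16: ~13.0 GiB
# int8: ~6.5 GiB
# 4-bit: ~3.3 GiB
```

So bfloat16 weights alone overflow an 8 GB card, int8 just fits, and 4-bit fits with room to spare for activations.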

When using the reference software, set load_in_8bit = True for better odds of loading properly, or use prebuilt user-ready software like https://github.com/oobabooga/text-generation-webui (which can load in 4-bit, which will fit fine on your GPU).
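In the notebook settings quoted above, that suggestion amounts to flipping the load_in_8bit flag. A minimal sketch, assuming transformers, accelerate, and bitsandbytes are installed (the actual from_pretrained call is commented out here, since it downloads roughly 14 GB of weights):

```python
# Sketch of the 8-bit loading settings suggested above; assumes the
# `transformers`, `accelerate`, and `bitsandbytes` packages are installed.
load_kwargs = dict(
    device_map="auto",   # let accelerate place layers across GPU/CPU
    load_in_8bit=True,   # int8 quantization: ~1 byte per parameter
)

# Uncomment to actually load the model (downloads ~14 GB of weights):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "stabilityai/stablelm-tuned-alpha-7b", **load_kwargs)

print(load_kwargs)
```

With these settings the int8 weights (~6.5 GiB) fit on an 8 GB GPU instead of being offloaded to system RAM.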