oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

LLaMA won't load when using DeepSpeed #424

Closed catalpaaa closed 1 year ago

catalpaaa commented 1 year ago

Describe the bug

LLaMA won't load when using DeepSpeed; it just gets stuck on

Loading llama-7b...

and takes up all the RAM before freezing the system.

Is there an existing issue for this?

Reproduction

Follow the tutorial for enabling DeepSpeed, using LLaMA 7B HF.

Screenshot

No response

Logs

The only log shown is Loading llama-7b...

System Info

RTX 3080 Ti 12 GB on Ubuntu 22
catalpaaa commented 1 year ago

BTW, there's 64 GB of swap, but it looks like it's not being used.

catalpaaa commented 1 year ago

Also, it looks like whenever the model is split into multiple bins, things just freeze. To me, DeepSpeed just doesn't want to use the swap; it takes up all the buff/cache and then does nothing.
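A plausible reading of that symptom (an assumption, not confirmed in this thread): memory counted as buff/cache is mostly reclaimable page cache from reading the checkpoint files, and the kernel prefers to drop clean cached pages rather than write them to swap, so swap can stay at zero even while "used" memory looks full. A minimal sketch for telling the two situations apart by parsing /proc/meminfo-style output (the sample numbers below are made up for illustration):

```python
# Sketch: distinguish real memory pressure (low MemAvailable) from
# harmless page-cache growth (high Cached). Field names follow the
# standard Linux /proc/meminfo format; the sample values are invented.

def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field name -> size in kB."""
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info

SAMPLE = """\
MemTotal:       65536000 kB
MemFree:          512000 kB
MemAvailable:   48000000 kB
Cached:         46000000 kB
SwapTotal:      67108864 kB
SwapFree:       67108864 kB
"""

if __name__ == "__main__":
    mem = parse_meminfo(SAMPLE)
    # Large Cached together with large MemAvailable means the "full" RAM
    # is mostly reclaimable cache, so the kernel has no reason to swap.
    cache_dominated = mem["Cached"] > mem["MemTotal"] // 2
    swap_untouched = mem["SwapFree"] == mem["SwapTotal"]
    print(cache_dominated, swap_untouched)
```

On a real system you would read the text from /proc/meminfo (or run free -k); if MemAvailable is genuinely near zero while swap stays unused, something else (e.g. pinned or anonymous memory that can't be swapped in time before the OOM killer fires) is at play.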

cyrcule commented 1 year ago

> Also, it looks like whenever the model is split into multiple bins, things just freeze. To me, DeepSpeed just doesn't want to use the swap; it takes up all the buff/cache and then does nothing.

Have you found a way to force deepspeed to use swap?
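As far as I know, DeepSpeed has no switch to target OS swap directly; the closest supported mechanism is ZeRO stage-3 parameter offload to CPU RAM or NVMe in the DeepSpeed config. The sketch below reflects my understanding of that config schema and is not taken from this thread; "/mnt/nvme" is a placeholder path, and the exact keys should be checked against the DeepSpeed documentation for your version:

```python
import json

# Hedged sketch: a DeepSpeed config fragment that offloads ZeRO-3
# parameters to NVMe instead of relying on OS swap. Keys follow my
# recollection of the DeepSpeed/ZeRO-Infinity schema; verify against
# the docs before use. "/mnt/nvme" is a placeholder mount point.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_param": {
            "device": "nvme",          # or "cpu" to spill params into system RAM
            "nvme_path": "/mnt/nvme",  # placeholder: a fast local SSD mount
            "pin_memory": True,
        },
    },
    "train_micro_batch_size_per_gpu": 1,
}

if __name__ == "__main__":
    print(json.dumps(ds_config, indent=2))
```

The design point is that DeepSpeed manages its own spill path explicitly (CPU buffers or NVMe files) rather than letting the kernel page anonymous memory to swap, which is why swap stays idle even when loading exhausts RAM.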