lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format. #1532

Open junxin367 opened 1 year ago

junxin367 commented 1 year ago

An error occurs when running vicuna-13B-1.1.

Here are the graphics cards in my environment: [image]

Scenario 1: There are two graphics cards with 32GB of memory each, and 26GB is sufficient to run the 13B model. However, line 112 of `model_adapter.py` caps GPU memory usage at 85%, which leaves only about 25GB available and causes an out-of-memory error.
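
As a rough sketch of what that cap amounts to (this paraphrases the behavior described here, not the actual `model_adapter.py` source; the use of `torch.cuda.mem_get_info` is an assumption):

```python
import torch

num_gpus = torch.cuda.device_count()
kwargs = {}
# Cap each card at 85% of its free memory, as described above.
# On a 32GB card this leaves roughly 25-27 GiB for the model weights.
kwargs["max_memory"] = {
    i: f"{int(torch.cuda.mem_get_info(i)[0] / 1024**3 * 0.85)}GiB"
    for i in range(num_gpus)
}
```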

Scenario 2: Using the `--max-gpu-memory 20GiB` flag should work, but the program loads 12.5GB onto the first card and the rest onto the second card, which results in an error. Line 116 of `model_adapter.py` shows that the 20GB limit is applied to both cards, and the program assumes it is enough to split the weights evenly between them.
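
Roughly, the flag turns into one identical cap per device (a sketch of the behavior described above; variable names are illustrative):

```python
# --max-gpu-memory 20GiB is applied uniformly to every visible card,
# regardless of how much memory each one actually has free.
num_gpus = 2
kwargs = {"max_memory": {i: "20GiB" for i in range(num_gpus)}}
```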

Solution: Manually change the threshold on line 112 to 0.9, and modify line 116 to add logic that calculates each card's limit from the actual GPU memory size (`available_gpu_memory`).
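
A minimal sketch of that suggestion, deriving each card's cap from its actually free memory (function and variable names here are illustrative, not the upstream code):

```python
import torch

def per_gpu_max_memory(num_gpus: int, threshold: float = 0.9) -> dict:
    """Build a max_memory map from each GPU's currently free memory."""
    max_memory = {}
    for gpu_id in range(num_gpus):
        free_bytes, _total_bytes = torch.cuda.mem_get_info(gpu_id)
        # Use 90% of whatever is actually free on this card, not a fixed value.
        max_memory[gpu_id] = f"{int(free_bytes / 1024**3 * threshold)}GiB"
    return max_memory

# e.g. kwargs["max_memory"] = per_gpu_max_memory(num_gpus=2)
```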

Scenario 3: After modifying line 116 as described in the solution above, an error still occurs when `kwargs["device_map"] = "auto"` is set, because the `kwargs["max_memory"]` values are identical. The full error message is: "The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format."

Modified code: [image]

Memory usage after modification: [image]

Bluemist76 commented 1 year ago

What file do we edit for this?

Orad commented 1 year ago

This issue is mentioned here: https://github.com/nomic-ai/gpt4all/issues/239. We need to update the call to `AutoModelForCausalLM`:

```python
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    offload_folder="offload",
    torch_dtype=torch.float16,
)
```
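
Two notes on this workaround: `offload_folder` just needs to be a writable path; accelerate will spill any layers that do not fit on the GPUs into it, which avoids the loading failure but makes inference noticeably slower. Installing `safetensors`, as the error message suggests, is the lighter alternative when the checkpoint is published in that format, presumably because the checkpoint files themselves can then serve the disk-offloaded weights without a separate offload copy.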

allenhaozi commented 1 year ago

met same error on falcon-40b

ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format.
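
For falcon-40b the same workaround should apply; a minimal sketch (the checkpoint name, memory limits, and offload path are illustrative, and older falcon releases needed `trust_remote_code=True`):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",                  # illustrative checkpoint
    device_map="auto",
    max_memory={0: "20GiB", 1: "20GiB"},  # adjust to your cards
    offload_folder="offload",             # layers that do not fit are spilled here
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,               # required by early falcon releases
)
```
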
tareeesh2001 commented 7 months ago

> met same error on falcon-40b
>
> ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format.

Did you happen to solve this error??