I'm having trouble loading models into the WebUI. I followed the guides I found online, and after starting the WebUI, two different errors appear depending on the model.

TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g
"2023-08-02 18:31:17 ERROR:Failed to load the model.
Traceback (most recent call last):
File "F:\oobabooga_windows\text-generation-webui\server.py", line 68, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "F:\oobabooga_windows\text-generation-webui\modules\models.py", line 78, in load_model
output = load_func_map[loader]
File "F:\oobabooga_windows\text-generation-webui\modules\models.py", line 287, in AutoGPTQ_loader
return modules.AutoGPTQ_loader.load_quantized(model_name)
File "F:\oobabooga_windows\text-generation-webui\modules\AutoGPTQ_loader.py", line 53, in load_quantized
model = AutoGPTQForCausalLM.from_quantized(path_to_model, **params)
File "F:\oobabooga_windows\installer_files\env\lib\site-packages\auto_gptq\modeling\auto.py", line 94, in from_quantized
return quant_func(
File "F:\oobabooga_windows\installer_files\env\lib\site-packages\auto_gptq\modeling_base.py", line 793, in from_quantized
accelerate.utils.modeling.load_checkpoint_in_model(
File "F:\oobabooga_windows\installer_files\env\lib\site-packages\accelerate\utils\modeling.py", line 1336, in load_checkpoint_in_model
set_module_tensor_to_device(
File "F:\oobabooga_windows\installer_files\env\lib\site-packages\accelerate\utils\modeling.py", line 298, in set_module_tensor_to_device
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions."

TheBloke_chronos-hermes-13B-GPTQ\chronos-hermes-13b-GPTQ-4bit-128g
"WARNING:The safetensors archive passed at models\TheBloke_chronos-hermes-13B-GPTQ\chronos-hermes-13b-GPTQ-4bit-128g.no-act.order.safetensors does not contain metadata. Make sure to save your model with the save_pretrained method. Defaulting to 'pt' metadata."
My hardware: RTX 4070 (12 GB VRAM), 16 GB RAM, Intel i5-12400F.
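For reference, here is a rough back-of-the-envelope estimate I did for the weights alone (my assumptions: ~13B parameters at 4-bit quantization; this ignores activations, the KV cache, and the CUDA context overhead, which all add on top):

```python
# Rough VRAM estimate for the weights of a 4-bit GPTQ 13B model.
# Assumption: 13e9 parameters, 0.5 bytes per parameter (4-bit).
# Does NOT include activations, KV cache, or CUDA context overhead.
params = 13e9
bytes_per_param = 0.5
weights_gb = params * bytes_per_param / 1024**3
print(f"weights alone: {weights_gb:.1f} GB")  # ≈ 6.1 GB
```

So the weights should fit in 12 GB on paper, which is why the out-of-memory error surprises me; maybe the actual free VRAM at load time is lower (you can check it with `torch.cuda.mem_get_info()` if PyTorch is installed in the same environment), or the overhead during loading pushes it over.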
What am I doing wrong, and how can I fix it?