johnsmith0031 / alpaca_lora_4bit

MIT License

run_server.sh: ValueError: Autograd4bitQuantLinear() does not have a parameter or a buffer named g_idx. #98

Open yfliao opened 1 year ago

yfliao commented 1 year ago

How can I fix the following problem?

"ValueError: Autograd4bitQuantLinear() does not have a parameter or a buffer named g_idx."

script

❯ cat ./run_server.sh
#!/bin/bash

export PYTHONPATH=$PYTHONPATH:./:./text-generation-webui

CONFIG_PATH=./llama-7b-4bit-v2/config.json
MODEL_PATH=./llama-7b-4bit-v2/llama-7b-4bit-ts-ao-g128-v2.safetensors
LORA_PATH=./alpaca_lora/adapter_model.bin

#VENV_PATH=
#source $VENV_PATH/bin/activate
python ./scripts/run_server.py --config_path $CONFIG_PATH --model_path $MODEL_PATH --lora_path $LORA_PATH --groupsize=128 --quant_attn --port 5555 --pub_port 5556
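
For reference, the "-ao-" in the model filename suggests an act-order quantized checkpoint, and such files typically carry a g_idx tensor for each quantized layer. A quick way to confirm what the file actually contains is to list its tensor names with the same safetensors API that accelerate already uses (a diagnostic sketch against the MODEL_PATH above, not part of the repo's scripts):

# check_checkpoint.py -- diagnostic sketch, not part of alpaca_lora_4bit
from safetensors import safe_open

MODEL_PATH = "./llama-7b-4bit-v2/llama-7b-4bit-ts-ao-g128-v2.safetensors"

with safe_open(MODEL_PATH, framework="pt") as f:
    keys = list(f.keys())

g_idx_keys = [k for k in keys if k.endswith("g_idx")]
print(f"{len(keys)} tensors total, {len(g_idx_keys)} named *.g_idx")
# If g_idx tensors are present, the loader has to build quant layers that
# register a matching g_idx buffer, otherwise loading fails with the error quoted above.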

log

❯ ./run_server.sh
Loading ./llama-7b-4bit-v2/llama-7b-4bit-ts-ao-g128-v2.safetensors ...
Loading Model ...
/home/nyculiao/anaconda3/envs/pytorch/lib/python3.9/site-packages/accelerate/utils/modeling.py:779: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(checkpoint_file, framework="pt") as f:
/home/nyculiao/anaconda3/envs/pytorch/lib/python3.9/site-packages/torch/_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
/home/nyculiao/anaconda3/envs/pytorch/lib/python3.9/site-packages/torch/storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  storage = cls(wrap_storage=untyped_storage)
The safetensors archive passed at ./llama-7b-4bit-v2/llama-7b-4bit-ts-ao-g128-v2.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
/home/nyculiao/anaconda3/envs/pytorch/lib/python3.9/site-packages/accelerate/utils/modeling.py:820: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(checkpoint_file, framework="pt", device=device) as f:
Traceback (most recent call last):
  File "/home/nyculiao/liao/alpaca_lora_4bit/./scripts/run_server.py", line 26, in <module>
    server.run()
  File "/home/nyculiao/liao/alpaca_lora_4bit/model_server/server.py", line 147, in run
    self.load_model()
  File "/home/nyculiao/liao/alpaca_lora_4bit/model_server/server.py", line 79, in load_model
    model, tokenizer = load_llama_model_4bit_low_ram(self.config_path, self.model_path, groupsize=self.groupsize, is_v1_model=self.is_v1_model)
  File "/home/nyculiao/liao/alpaca_lora_4bit/autograd_4bit.py", line 204, in load_llama_model_4bit_low_ram
    model = accelerate.load_checkpoint_and_dispatch(
  File "/home/nyculiao/anaconda3/envs/pytorch/lib/python3.9/site-packages/accelerate/big_modeling.py", line 479, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "/home/nyculiao/anaconda3/envs/pytorch/lib/python3.9/site-packages/accelerate/utils/modeling.py", line 946, in load_checkpoint_in_model
    set_module_tensor_to_device(model, param_name, param_device, value=param, dtype=dtype)
  File "/home/nyculiao/anaconda3/envs/pytorch/lib/python3.9/site-packages/accelerate/utils/modeling.py", line 131, in set_module_tensor_to_device
    raise ValueError(f"{module} does not have a parameter or a buffer named {tensor_name}.")
ValueError: Autograd4bitQuantLinear() does not have a parameter or a buffer named g_idx.
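
What the traceback shows is accelerate's set_module_tensor_to_device finding a tensor named g_idx in the checkpoint and trying to copy it into an Autograd4bitQuantLinear that was constructed without any parameter or buffer of that name, i.e. the in-memory model definition and the safetensors file disagree. A minimal stand-alone sketch of the guard that fails (QuantLinearStub is a made-up stand-in for illustration, not the repo's class):

import torch

class QuantLinearStub(torch.nn.Module):
    # stand-in for a 4-bit quant layer built without a g_idx buffer
    def __init__(self):
        super().__init__()
        self.register_buffer("qweight", torch.zeros(1, dtype=torch.int32))

module = QuantLinearStub()
tensor_name = "g_idx"  # present in the checkpoint, missing from the module
# roughly the check accelerate performs before assigning a checkpoint tensor
if tensor_name not in dict(module.named_parameters()) and tensor_name not in dict(module.named_buffers()):
    raise ValueError(f"{module} does not have a parameter or a buffer named {tensor_name}.")
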
johnsmith0031 commented 1 year ago

Try cloning the latest code into a new folder? Maybe you have multiple references to autograd_4bit, or something else is going on.
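
A quick way to test the stale-copy theory is to print which autograd_4bit the interpreter actually imports (with the same PYTHONPATH that run_server.sh sets) and whether its Autograd4bitQuantLinear source mentions g_idx at all; both names are taken from the traceback above, everything else in this sketch is standard library:

import inspect
import autograd_4bit

# which copy of the module Python picked up from PYTHONPATH
print(autograd_4bit.__file__)

# does the class define or register g_idx? older copies may not
src = inspect.getsource(autograd_4bit.Autograd4bitQuantLinear)
print("g_idx mentioned in class source:", "g_idx" in src)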