johnsmith0031 / alpaca_lora_4bit


latest changes not loading model #70

Closed · winglian closed this 1 year ago

winglian commented 1 year ago
Traceback (most recent call last):
  File "finetune.py", line 55, in <module>
    model, tokenizer = load_llama_model_4bit_low_ram(ft_config.llama_q4_config_dir,
  File "/workspace/alpaca_lora_4bit/autograd_4bit.py", line 202, in load_llama_model_4bit_low_ram
    model = accelerate.load_checkpoint_and_dispatch(
  File "/opt/conda/lib/python3.8/site-packages/accelerate/big_modeling.py", line 479, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "/opt/conda/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 946, in load_checkpoint_in_model
    set_module_tensor_to_device(model, param_name, param_device, value=param, dtype=dtype)
  File "/opt/conda/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 131, in set_module_tensor_to_device
    raise ValueError(f"{module} does not have a parameter or a buffer named {tensor_name}.")
ValueError: Autograd4bitQuantLinear() does not have a parameter or a buffer named zeros.
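For context, a minimal sketch of the mismatch the traceback points at: the checkpoint serializes its packed zero points under the key `zeros`, while the updated quantized-linear module registers that buffer under a different name (assuming `qzeros` here; the class and buffer names below are illustrative, not the repo's actual `Autograd4bitQuantLinear` definition), so accelerate's lookup fails:

```python
# Illustrative sketch only: QuantLinearSketch and the qzeros buffer name are
# assumptions, not the actual Autograd4bitQuantLinear implementation.
import torch
import torch.nn as nn

class QuantLinearSketch(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # The updated kernel stores packed zero points under a new buffer name,
        # so the module no longer exposes anything called "zeros".
        self.register_buffer(
            "qzeros", torch.zeros(in_features // 8, out_features, dtype=torch.int32)
        )

module = QuantLinearSketch(256, 256)

# accelerate's set_module_tensor_to_device() effectively performs this check
# for every key in the checkpoint and raises when nothing matches:
tensor_name = "zeros"  # key as serialized by the older-format checkpoint
if tensor_name not in module._parameters and tensor_name not in module._buffers:
    raise ValueError(f"{module} does not have a parameter or a buffer named {tensor_name}.")
```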
Ph0rk0z commented 1 year ago

Did you update the kernel from the other repo? https://github.com/sterlind/GPTQ-for-LLaMa/tree/lora_4bit

winglian commented 1 year ago

Sorry, I didn't see that the new args were added to replace the env var.
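For anyone landing here later, a hedged sketch of what the fixed call might look like, passing the format flag as a function argument instead of the old env var (the `is_v1_model` name and the placeholder paths are assumptions, not confirmed by this thread):

```python
# Hedged sketch, not the repo's verbatim API: is_v1_model is a guess at the
# new argument that replaced the env var; paths are placeholders.
from autograd_4bit import load_llama_model_4bit_low_ram

config_dir = "path/to/llama_q4_config_dir"  # placeholder config directory
model_path = "path/to/llama-7b-4bit.pt"     # placeholder 4-bit checkpoint

model, tokenizer = load_llama_model_4bit_low_ram(
    config_dir,
    model_path,
    is_v1_model=True,  # assumed flag: checkpoint uses the old "zeros" layout
)
```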

Ph0rk0z commented 1 year ago

Yeah, it got me too at first.