johnsmith0031 / alpaca_lora_4bit

MIT License

AttributeError: module 'gptq_llama.quant_cuda' has no attribute 'vecquant4recons_v1' #51

Closed: WUHU-G closed this issue 1 year ago

WUHU-G commented 1 year ago

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/platform/anaconda3/envs/hcs did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.2/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 112
CUDA SETUP: Loading binary /home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cuda112.so...

Parameters:
-------config-------
dataset='./dataset.json'
ds_type='alpaca'
lora_out_dir='alpaca_lora'
lora_apply_dir=None
llama_q4_config_dir='./llama-13b-4bit/'
llama_q4_model='./llama-13b-4bit/llama-13b-4bit.pt'

------training------
mbatch_size=1
batch_size=2
gradient_accumulation_steps=2
epochs=3
lr=0.0002
cutoff_len=256
lora_r=8
lora_alpha=16
lora_dropout=0.05
val_set_size=0.2
gradient_checkpointing=False
gradient_checkpointing_ratio=1
warmup_steps=50
save_steps=50
save_total_limit=3
logging_steps=10
checkpoint=False
skip=False
world_size=1
ddp=False
device_map='auto'

Loading Model ...
normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty. use identity normalization.
Loaded the model in 15.77 seconds.
Fitting 4bit scales and zeros to half
Downloading and preparing dataset json/default to /home/platform/.cache/huggingface/datasets/json/default-0d98d378279da9bb/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51...
Downloading data files: 100%|██████████| 1/1 [00:00<00:00, 12192.74it/s]
Extracting data files: 100%|██████████| 1/1 [00:00<00:00, 2252.58it/s]
Dataset json downloaded and prepared to /home/platform/.cache/huggingface/datasets/json/default-0d98d378279da9bb/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51. Subsequent calls will reuse this data.
100%|██████████| 1/1 [00:00<00:00, 981.81it/s]
/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
  warnings.warn(
  0%|          | 0/24 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/platform/huangchensen/llama_qunt/finetune.py", line 147, in <module>
    trainer.train()
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/trainer.py", line 1639, in train
    return inner_training_loop(
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/trainer.py", line 1906, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/trainer.py", line 2652, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/trainer.py", line 2684, in compute_loss
    outputs = model(**inputs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/peft-0.3.0.dev0-py3.11.egg/peft/peft_model.py", line 529, in forward
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/models/llama/modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/models/llama/modeling_llama.py", line 577, in forward
    layer_outputs = decoder_layer(
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/models/llama/modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/models/llama/modeling_llama.py", line 196, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/peft-0.3.0.dev0-py3.11.egg/peft/tuners/lora.py", line 686, in forward
  File "/home/platform/huangchensen/llama_qunt/autograd_4bit.py", line 57, in forward
    out = AutogradMatmul4bit.apply(x, self.qweight, self.scales,
  File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/platform/huangchensen/llama_qunt/autograd_4bit.py", line 14, in forward
    output = mm4b._matmul4bit_v1_recons(x, qweight, scales, zeros)
  File "/home/platform/huangchensen/llama_qunt/matmul_utils_4bit.py", line 79, in _matmul4bit_v1_recons
    quant_cuda.vecquant4recons_v1(qweight, buffer, scales, zeros)
AttributeError: module 'gptq_llama.quant_cuda' has no attribute 'vecquant4recons_v1'
  0%|
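
For anyone triaging this: the failing call is the quant_cuda.vecquant4recons_v1 kernel invoked from matmul_utils_4bit.py, so the root cause is an installed gptq_llama build compiled without that kernel. A minimal diagnostic sketch you can run before training to see which kernels the installed extension actually exposes (vecquant4recons_v1 comes straight from the traceback; the other names are assumed candidates and may not exist on every branch):

# Sketch: list the 4-bit kernels exposed by the installed gptq_llama build.
# Only vecquant4recons_v1 is confirmed by the traceback above; the other
# names are assumptions and may differ between branches.
from gptq_llama import quant_cuda

for name in ("vecquant4recons_v1", "vecquant4recons_v2", "vecquant4matmul"):
    status = "present" if hasattr(quant_cuda, name) else "MISSING"
    print(f"{name}: {status}")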

johnsmith0031 commented 1 year ago

Use this branch: https://github.com/sterlind/GPTQ-for-LLaMa/tree/lora_4bit

pip uninstall gptq_llama
pip install git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit
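
After reinstalling, a quick sanity check (assuming the same import path as in the traceback above) to confirm the new build exposes the missing kernel:

# Should print True on the lora_4bit branch; False means an old build
# of the extension is still being picked up.
from gptq_llama import quant_cuda
print(hasattr(quant_cuda, "vecquant4recons_v1"))
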
WUHU-G commented 1 year ago

> Use this branch: https://github.com/sterlind/GPTQ-for-LLaMa/tree/lora_4bit
>
> pip uninstall gptq_llama
> pip install git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit

Thanks, it works.

turboderp commented 1 year ago

In case anyone else hits this error and the above doesn't work, try adding --force-reinstall:

pip uninstall gptq_llama
pip install --force-reinstall git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit

In my case the problem was that I'd previously installed older versions of GPTQ; for some reason components were left behind that pip uninstall gptq_llama didn't remove, but they still made pip think parts of the new package were already installed.
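
If --force-reinstall alone still picks up leftover pieces, one way to force a fully clean rebuild is to also bypass pip's cache. A sketch (pip cache purge requires pip >= 20.1; --no-cache-dir makes pip rebuild the wheel from source instead of reusing a cached one):

pip uninstall gptq_llama
pip cache purge
pip install --force-reinstall --no-cache-dir git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit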