Use this branch: https://github.com/sterlind/GPTQ-for-LLaMa/tree/lora_4bit
pip uninstall gptq_llama
pip install git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit
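After reinstalling, a quick way to confirm the new build took effect is to check for the kernel the trainer ends up calling (vecquant4recons_v1 is the attribute named in the traceback further down; this sanity check is an editorial sketch, not part of the repo):

# Should print True on the lora_4bit build;
# False means the old extension is still the one being imported.
import gptq_llama.quant_cuda as quant_cuda
print(hasattr(quant_cuda, "vecquant4recons_v1"))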
Thanks, it works.
In case anyone else hits this error and the plain reinstall doesn't fix it, try forcing the reinstall:
pip uninstall gptq_llama
pip install --force-reinstall git+https://github.com/sterlind/GPTQ-for-LLaMa.git@lora_4bit
The issue in my case was that I'd had older versions of GPTQ installed; some components were left behind that pip uninstall gptq_llama wouldn't remove, but they still made pip think parts of the new package were already installed.
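One way to spot this kind of leftover (a diagnostic sketch, not something from the repo) is to print where the compiled extension is actually imported from and compare that path against the fresh install:

# If this path points at a stale build directory or an old egg, pip uninstall
# missed it, and --force-reinstall (or deleting it by hand) is needed.
import gptq_llama.quant_cuda as quant_cuda
print(quant_cuda.__file__)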
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to:
https://github.com/TimDettmers/bitsandbytes/issues
/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/platform/anaconda3/envs/hcs did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.2/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 112
CUDA SETUP: Loading binary /home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cuda112.so...
Parameters:
-------config-------
dataset='./dataset.json'
ds_type='alpaca'
lora_out_dir='alpaca_lora'
lora_apply_dir=None
llama_q4_config_dir='./llama-13b-4bit/'
llama_q4_model='./llama-13b-4bit/llama-13b-4bit.pt'
------training------
mbatch_size=1
batch_size=2
gradient_accumulation_steps=2
epochs=3
lr=0.0002
cutoff_len=256
lora_r=8
lora_alpha=16
lora_dropout=0.05
val_set_size=0.2
gradient_checkpointing=False
gradient_checkpointing_ratio=1
warmup_steps=50
save_steps=50
save_total_limit=3
logging_steps=10
checkpoint=False
skip=False
world_size=1
ddp=False
device_map='auto'
Loading Model ...
normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty. use identity normalization.
Loaded the model in 15.77 seconds.
Fitting 4bit scales and zeros to half
Downloading and preparing dataset json/default to /home/platform/.cache/huggingface/datasets/json/default-0d98d378279da9bb/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51...
Downloading data files: 100%|███████████████| 1/1 [00:00<00:00, 12192.74it/s]
Extracting data files: 100%|███████████████| 1/1 [00:00<00:00, 2252.58it/s]
Dataset json downloaded and prepared to /home/platform/.cache/huggingface/datasets/json/default-0d98d378279da9bb/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51. Subsequent calls will reuse this data.
100%|███████████████| 1/1 [00:00<00:00, 981.81it/s]
/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
  warnings.warn(
  0%|          | 0/24 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/platform/huangchensen/llama_qunt/finetune.py", line 147, in <module>
trainer.train()
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/trainer.py", line 1639, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/trainer.py", line 1906, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/trainer.py", line 2652, in training_step
loss = self.compute_loss(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/trainer.py", line 2684, in compute_loss
outputs = model(**inputs)
^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/peft-0.3.0.dev0-py3.11.egg/peft/peft_model.py", line 529, in forward
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/models/llama/modeling_llama.py", line 687, in forward
outputs = self.model(
^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/models/llama/modeling_llama.py", line 577, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/models/llama/modeling_llama.py", line 292, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/transformers-4.28.0.dev0-py3.11.egg/transformers/models/llama/modeling_llama.py", line 196, in forward
query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/peft-0.3.0.dev0-py3.11.egg/peft/tuners/lora.py", line 686, in forward
File "/home/platform/huangchensen/llama_qunt/autograd_4bit.py", line 57, in forward
out = AutogradMatmul4bit.apply(x, self.qweight, self.scales,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/anaconda3/envs/hcs/lib/python3.11/site-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs)  # type: ignore[misc]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/huangchensen/llama_qunt/autograd_4bit.py", line 14, in forward
output = mm4b._matmul4bit_v1_recons(x, qweight, scales, zeros)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/platform/huangchensen/llama_qunt/matmul_utils_4bit.py", line 79, in _matmul4bit_v1_recons
quant_cuda.vecquant4recons_v1(qweight, buffer, scales, zeros)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'gptq_llama.quant_cuda' has no attribute 'vecquant4recons_v1'
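This is the same stale-extension failure the fix above addresses: the quant_cuda binary being imported predates the lora_4bit branch, which is what provides vecquant4recons_v1, so reinstalling from that branch (with --force-reinstall if needed) resolves it. If you want the script to fail with a clearer message, a hypothetical guard near the top of finetune.py could look like this (module and attribute names taken from the traceback; the guard itself is not part of the repo):

# Hypothetical fail-fast guard: give a clear message instead of a deep
# AttributeError when the wrong quant_cuda build is installed.
import gptq_llama.quant_cuda as quant_cuda

if not hasattr(quant_cuda, "vecquant4recons_v1"):
    raise ImportError(
        "gptq_llama.quant_cuda is missing vecquant4recons_v1; "
        "reinstall from the lora_4bit branch of sterlind/GPTQ-for-LLaMa."
    )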