YouAreSpecialToMe / QST

Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
Apache License 2.0

ERROR during training #2

Open syj2908 opened 1 week ago

syj2908 commented 1 week ago

Hi, I encountered this error while training llama-2-7B using the script. Any idea how to fix it?

Traceback (most recent call last):
  File "qst.py", line 942, in <module>
    train()
  File "qst.py", line 903, in train
    train_result = trainer.train()
  File "/home/anaconda3/lib/python3.8/site-packages/transformers/trainer.py", line 1553, in train
    return inner_training_loop(
  File "/home/anaconda3/lib/python3.8/site-packages/transformers/trainer.py", line 1835, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/anaconda3/lib/python3.8/site-packages/transformers/trainer.py", line 2690, in training_step
    self.accelerator.backward(loss)
  File "/home/anaconda3/lib/python3.8/site-packages/accelerate/accelerator.py", line 1923, in backward
    loss.backward(**kwargs)
  File "/home/anaconda3/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/anaconda3/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

geniedan commented 1 week ago

Please try the HF models (e.g., Llama-2-7b-hf and Llama-2-3b-hf).

syj2908 commented 1 week ago

> Please try the HF models (e.g., Llama-2-7b-hf and Llama-2-3b-hf).

Thanks, I'll try it later. For llama-2-7b, I added model.enable_input_require_grads() after model, tokenizer = get_accelerate_model(args, checkpoint_dir) in the train() function, and it works.
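
For reference, a minimal self-contained sketch of what that call does and why it avoids the RuntimeError above. It uses plain transformers instead of the repo's get_accelerate_model helper; the model name and the freezing loop are illustrative assumptions, not QST's exact code. In qst.py the call would go right after model, tokenizer = get_accelerate_model(args, checkpoint_dir) in train(), as described in the comment.

```python
# Illustrative sketch only (not the repo's code): shows why
# model.enable_input_require_grads() fixes the backward() error.
from transformers import AutoModelForCausalLM

# Assumed checkpoint for demonstration; any causal LM works the same way.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# QST keeps the quantized backbone frozen; emulate that here.
for p in model.parameters():
    p.requires_grad = False

# With every backbone parameter frozen, the forward pass yields a loss with no
# grad_fn, so loss.backward() raises:
#   "element 0 of tensors does not require grad and does not have a grad_fn".
# enable_input_require_grads() registers a forward hook on the input embeddings
# so their outputs require grad, letting gradients flow to the trainable
# side-network / adapter parameters added on top of the frozen backbone.
model.enable_input_require_grads()
```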