syj2908 opened this issue 1 week ago
Please try an HF model (e.g., Llama-2-7b-hf or Llama-2-13b-hf).
Thanks, I'll try it later. For Llama-2-7B, I added `model.enable_input_require_grads()` right after `model, tokenizer = get_accelerate_model(args, checkpoint_dir)` in the `train()` function, and it works (see the sketch below).
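For reference, a minimal sketch of the placement (assuming the repo's qst.py, where `train()`, `get_accelerate_model()`, `args`, and `checkpoint_dir` already exist; this is an illustration, not verbatim code from the repo):

```python
def train():
    ...
    model, tokenizer = get_accelerate_model(args, checkpoint_dir)

    # With gradient checkpointing on and the quantized base weights frozen,
    # the inputs to the checkpointed blocks don't require grad. Making the
    # embedding outputs require grad gives autograd a differentiable path:
    model.enable_input_require_grads()
    ...
    train_result = trainer.train()
```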
Hi, I encountered this error while training Llama-2-7B with the script; any idea how to fix it?
```
Traceback (most recent call last):
  File "qst.py", line 942, in <module>
    train()
  File "qst.py", line 903, in train
    train_result = trainer.train()
  File "/home/anaconda3/lib/python3.8/site-packages/transformers/trainer.py", line 1553, in train
    return inner_training_loop(
  File "/home/anaconda3/lib/python3.8/site-packages/transformers/trainer.py", line 1835, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/anaconda3/lib/python3.8/site-packages/transformers/trainer.py", line 2690, in training_step
    self.accelerator.backward(loss)
  File "/home/anaconda3/lib/python3.8/site-packages/accelerate/accelerator.py", line 1923, in backward
    loss.backward(**kwargs)
  File "/home/anaconda3/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/anaconda3/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```
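This error typically means gradient checkpointing is enabled while every parameter feeding the checkpointed blocks is frozen (as with a quantized base model plus adapters), so `backward()` finds no tensor with a `grad_fn`. The `enable_input_require_grads()` call mentioned above works around this by hooking the input embeddings; it is roughly equivalent to the following (sketch of the transformers behavior, using the same `model` as above):

```python
# Roughly what model.enable_input_require_grads() does: force the embedding
# layer's output to require grad so autograd has a starting point even when
# all upstream parameters are frozen.
def make_inputs_require_grad(module, input, output):
    output.requires_grad_(True)

model.get_input_embeddings().register_forward_hook(make_inputs_require_grad)
```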