Hi,
I tried running alpaca_finetuning_v1/finetuning.sh and encountered a runtime error.
Traceback (most recent call last):
File "finetuning.py", line 294, in <module>
main(args)
File "finetuning.py", line 253, in main
train_stats = train_one_epoch(
File "/home/LLaMA-Adapter/alpaca_finetuning_v1/engine_finetuning.py", line 50, in train_one_epoch
loss /= accum_iter
RuntimeError: Output 0 of _DDPSinkBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.
I tried cloning the loss by adding loss = loss.clone() before the line loss /= accum_iter, and the script now runs. However, I am not sure whether this affects the backward pass (or the training) in any way. Also, are there any suggestions for avoiding this runtime error altogether?
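For reference, here is a minimal, self-contained sketch (not the repo's actual training loop) of the two ways to scale the loss for gradient accumulation: the clone workaround described above, and an out-of-place division that avoids modifying any view in place. The names scale_loss_clone and scale_loss_out_of_place are illustrative, not from the repo.

```python
import torch

def scale_loss_clone(loss: torch.Tensor, accum_iter: int) -> torch.Tensor:
    # Workaround from this issue: clone first, so the in-place division
    # no longer touches the view returned by DDP's custom Function.
    loss = loss.clone()
    loss /= accum_iter
    return loss

def scale_loss_out_of_place(loss: torch.Tensor, accum_iter: int) -> torch.Tensor:
    # Alternative: out-of-place division creates a fresh tensor, so no
    # view is modified in place and the error never triggers.
    return loss / accum_iter

x = torch.tensor(6.0, requires_grad=True)
loss = x * 2  # stand-in for the model's loss

a = scale_loss_clone(loss, 4)
b = scale_loss_out_of_place(loss, 4)
assert torch.allclose(a, b)  # both scale the value identically

# Gradients are also identical: d((2x)/4)/dx = 0.5
a.backward()
assert x.grad.item() == 0.5
```

Mathematically both variants are the same operation, so the gradient flowing back through the loss is unchanged; the clone only adds a copy node to the graph.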
My environment:
GPU = NVIDIA Tesla V100 SXM3 32 GB
CUDA Version = 11.1
torch version = 1.10.1+cu111
Thank you