OpenGVLab / LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Runtime error when running the fine-tuning script. #41

Open seelenbrecher opened 1 year ago

seelenbrecher commented 1 year ago

Hi,

I tried running alpaca_finetuning_v1/finetuning.sh and encountered a runtime error.

Traceback (most recent call last):
  File "finetuning.py", line 294, in <module>
    main(args)
  File "finetuning.py", line 253, in main
    train_stats = train_one_epoch(
  File "/home/LLaMA-Adapter/alpaca_finetuning_v1/engine_finetuning.py", line 50, in train_one_epoch
    loss /= accum_iter
RuntimeError: Output 0 of _DDPSinkBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.

I tried cloning the loss by adding loss = loss.clone() before loss /= accum_iter, and the script now works. However, I am not sure whether this affects the backward pass (or the training). Also, do you have any suggestions for avoiding this runtime error?
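For context, the workaround boils down to copying the loss tensor before the in-place division, so the division no longer writes into the view returned by DDP's custom autograd Function. A minimal sketch of the change (the scale_loss helper name is mine; only loss and accum_iter appear in the traceback):

import torch

def scale_loss(loss: torch.Tensor, accum_iter: int) -> torch.Tensor:
    # clone() copies the tensor that the custom Function returned as a view,
    # so the in-place division below touches a plain tensor instead.
    # clone() is differentiable with an identity backward, so gradients
    # should be unchanged apart from the cost of one extra copy.
    loss = loss.clone()
    loss /= accum_iter
    return loss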

My environment:

GPU = NVIDIA Tesla V100 SXM3 32 GB
CUDA Version = 11.1
torch version = 1.10.1+cu111

Thank you

aojunzz commented 1 year ago

You can use a higher PyTorch version; the default version is PyTorch 2.0.
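As a quick sanity check after upgrading (a trivial snippet, not from the repo), you can confirm which PyTorch build is actually picked up before rerunning finetuning.sh:

import torch

print(torch.__version__)          # should report 2.0.x or later after the upgrade
print(torch.version.cuda)         # CUDA toolkit the wheel was built against
print(torch.cuda.is_available())  # True if the GPU is visible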