TianhongDai / reinforcement-learning-algorithms

This repository contains PyTorch implementations of most of the classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, and TRPO. (More algorithms are still in progress.)

Bug using SAC with torch version 1.8.0a0+963f762 #8

Open dmksjfl opened 3 years ago

dmksjfl commented 3 years ago

I ran the SAC code with torch (a build compiled from source) and encountered the error

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 1]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

How can I fix it?

The full traceback is listed below:

Logging to logs//HalfCheetah-v2/
Initial exploration has been finished!
Traceback (most recent call last):
  File "train.py", line 14, in <module>
    sac_trainer.learn()
  File "/home/reinforcement-learning-algorithms/rl_algorithms/sac/sac_agent.py", line 97, in learn
    qf1_loss, qf2_loss, actor_loss, alpha, alpha_loss = self._update_newtork()
  File "/home/reinforcement-learning-algorithms/rl_algorithms/sac/sac_agent.py", line 189, in _update_newtork
    actor_loss.backward()
  File "/home/admin/anaconda3/envs/pytorch_build/lib/python3.8/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/admin/anaconda3/envs/pytorch_build/lib/python3.8/site-packages/torch/autograd/__init__.py", line 130, in backward
    Variable._execution_engine.run_backward(

This is how I run the code:

python train.py --env-name HalfCheetah-v2 --cuda --seed 1
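
For context, this kind of error usually shows up on newer PyTorch versions (roughly 1.5 and later, if I remember correctly), where optimizer.step() updates parameters in place and bumps their version counters; any loss whose graph was built before the step then fails at backward() exactly like this. Below is a minimal, self-contained sketch of that pattern and one reordering that avoids it. All names here (critic, actor, critic_optim, ...) are placeholders, not the identifiers used in this repo's sac_agent.py.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder networks/optimizers: a single Q-network and a deterministic
# "actor", just enough to reproduce the ordering problem.
obs_dim, act_dim, batch = 4, 2, 8
critic = nn.Linear(obs_dim + act_dim, 1)
actor = nn.Linear(obs_dim, act_dim)
critic_optim = torch.optim.Adam(critic.parameters(), lr=1e-3)
actor_optim = torch.optim.Adam(actor.parameters(), lr=1e-3)

obs = torch.randn(batch, obs_dim)
acts = torch.randn(batch, act_dim)
target_q = torch.randn(batch, 1)

# Critic loss against a fixed target.
q_loss = F.mse_loss(critic(torch.cat([obs, acts], dim=1)), target_q)

# Failing pattern on newer PyTorch: build the actor loss (whose graph runs
# through the critic's current weights), step the critic optimizer (in-place
# update, version counter bump), and only then call actor_loss.backward():
#
#   actor_loss = -critic(torch.cat([obs, actor(obs)], dim=1)).mean()
#   critic_optim.zero_grad(); q_loss.backward(); critic_optim.step()
#   actor_optim.zero_grad(); actor_loss.backward()   # <-- RuntimeError here
#
# One ordering that avoids it: finish the critic update first, then build the
# actor loss afterwards so its graph only ever sees the updated weights.
critic_optim.zero_grad()
q_loss.backward()
critic_optim.step()

actor_loss = -critic(torch.cat([obs, actor(obs)], dim=1)).mean()
actor_optim.zero_grad()
actor_loss.backward()
actor_optim.step()
print('actor grad norm:', actor.weight.grad.norm().item())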
wq13552463699 commented 2 years ago

I have the same problem. I uninstalled the higher version and downgraded to PyTorch 1.4.0, after which the error is no longer raised. However, during training I found that the actor does not receive any gradient: the Q-network converges, but the actor does not converge at all.
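
I am not sure what exactly changes under 1.4.0, but a quick way to check whether any gradient actually reaches the actor is to print per-parameter gradient norms right after actor_loss.backward() (actor here is a placeholder handle on the policy network, not the repo's exact attribute name):

# Print gradient norms immediately after actor_loss.backward().
for name, p in actor.named_parameters():
    grad_norm = None if p.grad is None else p.grad.norm().item()
    print(f'{name}: grad norm = {grad_norm}')

If every norm is None or exactly zero, the actor loss is not connected to the actor's parameters at all; if the norms are nonzero but training still stalls, it may be worth checking the update ordering discussed above.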