Xingyu-Lin / mbpo_pytorch

A PyTorch replication of the model-based reinforcement learning algorithm MBPO

Error when trying to run #6

Open BY571 opened 2 years ago

BY571 commented 2 years ago

Hey, thank you for your work, but sadly I'm not able to run your code; I'm getting the in-place operation error below. It's strange that this only seems to happen to me, since I just cloned the repo and ran your example command.

File "mbpo.py", line 267, in
main() File "mbpo.py", line 263, in main train(args, env_sampler, predict_env, agent, env_pool, model_pool) File "mbpo.py", line 124, in train train_policy_steps += train_policy_repeats(args, total_step, train_policy_steps, cur_step, env_pool, model_pool, agent) File "mbpo.py", line 220, in train_policy_repeats agent.update_parameters((batch_state, batch_action, batch_reward, batch_next_state, batch_done), args.policy_train_batch_size, i ) File "/shared/sebastian/replication-mbpo/sac/sac.py", line 89, in update_parameters policy_loss.backward() File "/shared/sebastian/miniconda3/envs/rrc_simulation/lib/python3.6/site-packages/torch/tensor.py", line 221, in backward torch.autograd.backward(self, gradient, retain_graph, create_graph) File "/shared/sebastian/miniconda3/envs/rrc_simulation/lib/python3.6/site-packages/torch/autograd/init.py", line 132, in backw ard allow_unreachable=True) # allow_unreachable flag RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTenso r [256, 1]], which is output 0 of TBackward, is at version 3; expected version 2 instead. Hint: the backtrace further above shows th e operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
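For context, here is a minimal sketch (hypothetical network and optimizer names, not the repo's actual code) of the failure mode that usually produces this exact error in SAC-style updates on newer PyTorch versions: the critic optimizer steps (an in-place update of the critic weights) before `policy_loss.backward()`, so the graph saved for the policy loss refers to weights whose version counter has since changed. This is only an assumption about the cause here; the fix shown is to build/backprop the policy loss against the current weights before (or after) the critic step, not to reuse a stale graph.

```python
import torch
import torch.nn as nn

# Stand-ins for the Q-network and policy network (hypothetical, for illustration only).
critic = nn.Linear(4, 1)
actor = nn.Linear(4, 4)
critic_optim = torch.optim.Adam(critic.parameters(), lr=1e-3)
actor_optim = torch.optim.Adam(actor.parameters(), lr=1e-3)

state = torch.randn(256, 4)

# --- order that triggers the error --------------------------------------
q_loss = critic(state).pow(2).mean()        # critic loss
policy_loss = -critic(actor(state)).mean()  # policy loss depends on the critic graph

critic_optim.zero_grad()
q_loss.backward()
critic_optim.step()            # in-place weight update bumps the saved tensor's version

# policy_loss.backward()       # -> RuntimeError: ... modified by an inplace operation

# --- working order -------------------------------------------------------
# Either call both backward() passes before any optimizer.step(), or rebuild
# the policy loss after the critic step so the graph uses the current weights:
actor_optim.zero_grad()
policy_loss = -critic(actor(state)).mean()
policy_loss.backward()
actor_optim.step()
```

The shape reported in the error ([256, 1], output of TBackward) is consistent with the transposed weight of a final Linear layer in the Q-network being the tensor that was modified in place, which is why reordering the backward/step calls (or recomputing the policy loss) is the usual remedy.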