nikhilbarhate99 / PPO-PyTorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
MIT License
1.66k stars, 343 forks

CUDA training error: expected dtype Double but got dtype Float #33

Closed: fatalfeel closed this issue 4 years ago

fatalfeel commented 4 years ago

0.002 (0.9, 0.999)
Episode 20 avg length: 85 reward: -228
Traceback (most recent call last):
  File "/mnt/projects/PPO-PyTorch/PPO.py", line 179, in <module>
    ppo.update(memory)
  File "/mnt/projects/PPO-PyTorch/PPO.py", line 122, in update
    loss.mean().backward()
  File "/usr/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: expected dtype Double but got dtype Float (validate_dtype at /pytorch/aten/src/ATen/native/TensorIterator.cpp:143)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7f3ece1e3536 in /usr/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: at::TensorIterator::compute_types() + 0xce3 (0x7f3ea23c2a23 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #2: at::TensorIterator::build() + 0x44 (0x7f3ea23c5404 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #3: at::native::mse_loss_backward_out(at::Tensor&, at::Tensor const&, at::Tensor const&, at::Tensor const&, long) + 0x193 (0x7f3ea2212953 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #4: + 0xf903d7 (0x7f3e65d073d7 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: at::native::mse_loss_backward(at::Tensor const&, at::Tensor const&, at::Tensor const&, long) + 0x172 (0x7f3ea221b092 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #6: + 0xf9068f (0x7f3e65d0768f in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #7: + 0x10c2536 (0x7f3ea264b536 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #8: + 0x2a9ecdb (0x7f3ea4027cdb in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #9: + 0x10c2536 (0x7f3ea264b536 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #10: torch::autograd::generated::MseLossBackward::apply(std::vector<at::Tensor, std::allocator >&&) + 0x1f7 (0x7f3ea3e2f777 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #11: + 0x2d89705 (0x7f3ea4312705 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #12: torch::autograd::Engine::evaluate_function(std::shared_ptr&, torch::autograd::Node*, torch::autograd::InputBuffer&) + 0x16f3 (0x7f3ea430fa03 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #13: torch::autograd::Engine::thread_main(std::shared_ptr const&, bool) + 0x3d2 (0x7f3ea43107e2 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #14: torch::autograd::Engine::thread_init(int) + 0x39 (0x7f3ea4308e59 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #15: torch::autograd::python::PythonEngine::thread_init(int) + 0x38 (0x7f3ed6fa6ac8 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #16: + 0xc70f (0x7f3ed668770f in /usr/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #17: + 0x76ba (0x7f3edf5e06ba in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #18: clone + 0x6d (0x7f3edf31641d in /lib/x86_64-linux-gnu/libc.so.6)

Process finished with exit code 1

fatalfeel commented 4 years ago

Fixed. Use this:

rewards = torch.tensor(rewards, dtype=torch.float32).to(device)
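For anyone hitting the same trace, here is a minimal sketch of why the explicit dtype likely helps. It assumes the rewards reach `torch.tensor` as float64 values (NumPy's default), which the thread does not confirm. The array contents and the `state_values` tensor are hypothetical stand-ins, not code from PPO.py. Newer PyTorch versions may promote mixed dtypes instead of raising, so the sketch only inspects dtypes and does not try to reproduce the error.

```python
import numpy as np
import torch
import torch.nn.functional as F

# Fall back to CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the discounted rewards; NumPy defaults to float64.
rewards = np.array([1.0, 0.5, -0.25])

# Without an explicit dtype, torch.tensor keeps NumPy's float64 (Double).
as_double = torch.tensor(rewards).to(device)

# The fix from this thread: force float32 (Float) to match the network.
as_float = torch.tensor(rewards, dtype=torch.float32).to(device)

# Hypothetical stand-in for the critic output; float32 is PyTorch's default.
state_values = torch.zeros(3, device=device, requires_grad=True)

# With matching dtypes, the MSE loss and its backward pass stay in float32.
loss = F.mse_loss(state_values, as_float)
loss.backward()

print(as_double.dtype, as_float.dtype, state_values.grad.dtype)
```

An alternative under the same assumption would be casting after construction with `.float()`. Passing `dtype=torch.float32` up front avoids creating the intermediate float64 tensor.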