Closed kosta-jo closed 3 years ago
pfrl.agents.DDPG expects as a policy an nn.Module that returns a torch.distributions.Distribution, not a Tensor. Here is the definition of the policy in examples/mujoco/reproduction/ddpg/train_ddpg.py, where DeterministicHead returns a Distribution, not a Tensor.
https://github.com/pfnet/pfrl/blob/70f3da9163c047fa6981047e74836f1e138694e0/examples/mujoco/reproduction/ddpg/train_ddpg.py#L134-L142
This may seem strange, but it allows users to use a reparameterized stochastic policy like in soft actor-critic or stochastic value gradient algorithms.
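To illustrate why this design helps, here is a minimal torch-free sketch (all names are hypothetical, not pfrl's actual classes): because the agent only ever calls `.sample()` on whatever the policy returns, the same agent code works with a deterministic head or a stochastic head.

```python
# Torch-free sketch of the "distribution head" pattern.
# Names are hypothetical; pfrl's real classes differ.
import random


class Deterministic:
    """Degenerate 'distribution' that always returns the same action."""
    def __init__(self, action):
        self.action = action

    def sample(self):
        return self.action


class Gaussian:
    """Toy stochastic 'distribution' over a scalar action."""
    def __init__(self, mean, std):
        self.mean, self.std = mean, std

    def sample(self):
        return random.gauss(self.mean, self.std)


def deterministic_policy(obs):
    # Final network output wrapped by a DeterministicHead-like object.
    return Deterministic(action=sum(obs))


def stochastic_policy(obs):
    # Final network output parameterizes a distribution instead.
    return Gaussian(mean=sum(obs), std=0.1)


def act(policy, obs):
    # Agent-side code: identical for both policy types.
    return policy(obs).sample()
```

`act(deterministic_policy, [1.0, 2.0])` always returns 3.0, while `act(stochastic_policy, [1.0, 2.0])` returns noisy samples around 3.0; the agent code does not change.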
@kosta-jo how did you solve this error? I am getting the same one while implementing PPO: AttributeError: 'Tensor' object has no attribute 'sample'
@ankush-ojha The answer from muupan solves it. You need to place a Distribution-returning layer at the end of the policy network. If you want a deterministic policy, just put a DeterministicHead layer there.
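A torch-free sketch of what "put the head last" means (hypothetical minimal Sequential, not pfrl's actual API): the head is just the final layer, and it wraps the Tensor-like output in an object that has `.sample()`.

```python
# Hypothetical minimal building blocks, not pfrl's real classes.

class Linear:
    """Toy 'layer' that scales each element of its input."""
    def __init__(self, scale):
        self.scale = scale

    def __call__(self, x):
        return [v * self.scale for v in x]


class DeterministicHead:
    """Wraps the final output in an object exposing .sample()."""
    def __call__(self, x):
        class Dist:
            def sample(self_inner):
                return x
        return Dist()


class Sequential:
    """Applies layers in order, like a minimal nn.Sequential."""
    def __init__(self, *layers):
        self.layers = layers

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


# The head goes at the end, so the policy's output has .sample().
policy = Sequential(Linear(2.0), DeterministicHead())
action = policy([1.0, 3.0]).sample()  # [2.0, 6.0]
```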
https://github.com/pfnet/pfrl/blob/70f3da9163c047fa6981047e74836f1e138694e0/pfrl/agents/ddpg.py#L258-L262 Here, self.policy is of type nn.Module, so calling self.policy(batch_xs) returns a PyTorch Tensor, which does not have a sample attribute, and I get the error. What is wrong with this?
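The failure mode can be reproduced without torch at all (a hypothetical minimal agent, not pfrl's code): if the policy returns a raw value with no `.sample()` method, the agent's call to `.sample()` raises exactly this kind of AttributeError.

```python
# Torch-free reproduction of the error. RawPolicy returns a plain
# list (standing in for a raw Tensor), so .sample() does not exist.

class RawPolicy:
    def __call__(self, obs):
        return [o * 2.0 for o in obs]  # plain list, no .sample()


def batch_act(policy, batch_obs):
    # Agent-side code always calls .sample() on the policy output.
    return policy(batch_obs).sample()


try:
    batch_act(RawPolicy(), [0.5, 1.0])
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'sample'
```

Ending the policy with a Distribution-returning head, as in the example script linked above, is what makes `.sample()` available.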