Lagrant opened this issue 2 years ago
It seems you used PyTorch 1.0.1 in your project. PyTorch has added in-place operation detection since version 1.5, which makes your code
```python
def get_action(self, obs, deterministic=False):
    ''' sample action from the policy, conditioned on the task embedding '''
    z = self.z
    obs = ptu.from_numpy(obs[None])
    in_ = torch.cat([obs, z], dim=1)
    return self.policy.get_action(in_, deterministic=deterministic)
```
at https://github.com/katerakelly/oyster/blob/master/rlkit/torch/sac/agent.py throw an in-place operation error, since the latent vector `z` is not detached from the action generation. This causes the policy loss to update `z` multiple times within one optimization step, which may lead to errors.

I think that might be a bug in your code and could affect the reliability of your final results.

Could you check on that?
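For illustration, here is a minimal, hypothetical sketch of the kind of fix being suggested: detaching `z` before action selection so that rollout-time action generation does not build a graph back into the task embedding. The `PEARLAgent` class, its dimensions, and the plain `nn.Linear` policy below are made-up stand-ins, not the repository's actual code.

```python
import torch
import torch.nn as nn

class PEARLAgent(nn.Module):
    # Simplified stand-in for the agent in rlkit/torch/sac/agent.py;
    # dimensions are arbitrary, chosen only for the example.
    def __init__(self, latent_dim=5, obs_dim=3, action_dim=2):
        super().__init__()
        # task embedding z (would normally be sampled from the inferred posterior)
        self.z = torch.zeros(1, latent_dim, requires_grad=True)
        # toy policy: a single linear layer instead of the real TanhGaussianPolicy
        self.policy = nn.Linear(obs_dim + latent_dim, action_dim)

    def get_action(self, obs, deterministic=False):
        ''' sample action from the policy, conditioned on the task embedding '''
        # detach so that action generation does not backprop into z
        z = self.z.detach()
        obs = torch.as_tensor(obs, dtype=torch.float32)[None]
        in_ = torch.cat([obs, z], dim=1)
        return self.policy(in_)

agent = PEARLAgent()
action = agent.get_action([0.1, 0.2, 0.3])
# Backprop through the action reaches the policy weights but not z,
# so z receives no gradient from action generation.
action.sum().backward()
print(agent.z.grad)  # None: z was detached from the graph
```

With the detach in place, `z` is only updated through the intended loss terms in the optimization step, not through gradients flowing back from action sampling.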