Thank you for providing this awesome repo!
I am trying to make results consistent across different runs via `seeding(seed, torch_deterministic=True)`. It is known that PyTorch has a broadcasting issue with deterministic algorithms: https://github.com/pytorch/pytorch/issues/79987
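For context, here is a minimal sketch of what such a seeding helper typically does. The function name, signature, and exact flags are my assumptions, not necessarily the repo's actual code:

```python
import os
import random

import numpy as np
import torch


def seeding(seed: int, torch_deterministic: bool = False):
    """Sketch of a typical seeding helper (names/flags assumed)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    if torch_deterministic:
        # Force deterministic kernels; any op without a deterministic
        # implementation (e.g. some broadcasting index_put_ paths) will raise.
        os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
        torch.use_deterministic_algorithms(True)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
```

With `torch_deterministic=True`, this is exactly the mode under which the broadcasting issue in the linked PyTorch report surfaces.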
So I manually fixed the broadcasting in each environment. For example, in `envs/ant` lines 204-206, I changed the code to:

```python
self.state.joint_q.view(self.num_envs, -1)[env_ids, 3:7] = self.start_rotation.clone().unsqueeze(0).expand(len(env_ids), -1)
self.state.joint_q.view(self.num_envs, -1)[env_ids, 7:] = self.start_joint_q.clone().unsqueeze(0).expand(len(env_ids), -1)
self.state.joint_qd.view(self.num_envs, -1)[env_ids, :] = torch.zeros(size=(len(env_ids), self.num_joint_qd), device=self.device)
```
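To illustrate why the explicit `expand` sidesteps the issue, here is a standalone toy version (shapes and values are made up for demonstration): the right-hand side is expanded to exactly match the indexed slice, so the indexed assignment no longer relies on the broadcasting path flagged in the PyTorch issue.

```python
import torch

# Toy shapes: 4 envs, 11 joint coordinates per env (illustrative only).
num_envs, q_dim = 4, 11
joint_q = torch.zeros(num_envs * q_dim)
start_rotation = torch.tensor([0.0, 0.0, 0.0, 1.0])  # quaternion, shape (4,)
env_ids = torch.tensor([0, 2])  # environments being reset

# Broadcasting form (can hit the non-deterministic kernel):
#   joint_q.view(num_envs, -1)[env_ids, 3:7] = start_rotation
# Explicit form: expand the RHS to (len(env_ids), 4) so shapes match exactly.
rhs = start_rotation.clone().unsqueeze(0).expand(len(env_ids), -1)
joint_q.view(num_envs, -1)[env_ids, 3:7] = rhs
```

Since `view` shares storage with `joint_q`, the assignment writes through to the flat state tensor, just as in the environment code.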
After these changes, I ran the experiments with and without `torch_deterministic=True`. For example, below is the Ant test, where the blue curve is without `torch_deterministic=True` and the orange curve is with `torch_deterministic=True`.
The non-deterministic run is similar to the paper's results; however, in the deterministic setting, the rewards remain unchanged throughout training.
Does anyone have ideas about what other issues `torch_deterministic=True` may bring? Thank you very much!