Two consecutive runs of the AnymalTerrain task with the same parameters and random seed seem to yield very different results. In my own code I narrowed it down to the terrain type, trimesh vs plane: plane is perfectly deterministic, while the training results on trimesh can vary wildly despite the same parameters, random seed and initial conditions. The same seems to hold in IsaacGymEnvs: the Anymal task is deterministic while AnymalTerrain isn't.
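To make "deterministic" concrete, one way to check it is to log the mean reward per iteration for two runs and look for the first iteration where they differ. Below is a minimal sketch, assuming the rewards have been exported to plain Python lists. `first_divergence` is a hypothetical helper, and the numbers are placeholders, not real training results.

```python
from typing import Optional, Sequence


def first_divergence(
    run_a: Sequence[float], run_b: Sequence[float], tol: float = 0.0
) -> Optional[int]:
    """Return the first index where the two logs differ by more than tol, else None."""
    for i, (a, b) in enumerate(zip(run_a, run_b)):
        if abs(a - b) > tol:
            return i
    # Logs of different length diverge at the end of the shorter one
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None


# Placeholder values for illustration only, not measured results
plane_run_1 = [0.10, 0.25, 0.40]
plane_run_2 = [0.10, 0.25, 0.40]
trimesh_run_1 = [0.10, 0.25, 0.40]
trimesh_run_2 = [0.10, 0.27, 0.31]

print(first_divergence(plane_run_1, plane_run_2))      # None: logs are identical
print(first_divergence(trimesh_run_1, trimesh_run_2))  # 1: first differing iteration
```

With `tol=0.0` this checks for bitwise-identical logs, which is what I mean by deterministic here. A small positive `tol` would instead separate floating-point noise from runs that really diverge.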
Is there a way to address this? It makes it very hard to tune rewards and hyperparameters, especially on more complex tasks where the differences between runs are pretty big.