reproduce the results

ChenyangRan commented 5 months ago

Hi, panda-gym does not seem to let me set the random seed the way older gym versions do, where you can call env.seed(seed) to reproduce results. When I call env.reset(seed=10) with the same seed, I get the same return values, such as desired_goal. But when I then call env.reset() without a seed, the state after those subsequent resets is not consistent from one run to the next. Is there a way to guarantee that each subsequent reset produces the same state, just like in gym?

qgallouedec commented 5 months ago

Hey, thanks for the question. It looks like a bug; I'll fix it.

qgallouedec commented 4 months ago

I've only managed to solve the issue for some environments. I can't solve it for the others. The discussion is also ongoing here: https://github.com/Farama-Foundation/Gymnasium/issues/1111

[x] PandaReach-v3
[x] PandaReachJoints-v3
[x] PandaReachDense-v3
[x] PandaReachJointsDense-v3
[x] PandaSlide-v3
[x] PandaSlideJoints-v3
[x] PandaSlideDense-v3
[x] PandaSlideJointsDense-v3
[ ] PandaPush-v3
[ ] PandaPushJoints-v3
[ ] PandaPushDense-v3
[ ] PandaPushJointsDense-v3
[ ] PandaPickAndPlace-v3
[ ] PandaPickAndPlaceJoints-v3
[ ] PandaPickAndPlaceDense-v3
[ ] PandaPickAndPlaceJointsDense-v3
[ ] PandaStack-v3
[ ] PandaStackJoints-v3
[ ] PandaStackDense-v3
[ ] PandaStackJointsDense-v3
[ ] PandaFlip-v3
[ ] PandaFlipJoints-v3
[ ] PandaFlipDense-v3
[ ] PandaFlipJointsDense-v3

To reproduce

import panda_gym  # importing panda_gym registers the Panda environments
import gymnasium as gym
from gymnasium.utils.env_checker import data_equivalence

env = gym.make("PandaPickAndPlace-v3")

# Two fixed actions, reused in both runs
action_0 = env.action_space.sample()
action_1 = env.action_space.sample()

# First run: seeded reset, then an unseeded reset
_, _ = env.reset(seed=10)
obs_00, _, _, _, _ = env.step(action_0)
_, _ = env.reset()
obs_01, _, _, _, _ = env.step(action_1)

# Second run: same seed and same actions
_, _ = env.reset(seed=10)
obs_10, _, _, _, _ = env.step(action_0)
_, _ = env.reset()
obs_11, _, _, _, _ = env.step(action_1)

# Both runs should yield identical observations
assert data_equivalence(obs_00, obs_10)
assert data_equivalence(obs_01, obs_11)
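
For comparison, the behaviour the script above checks for can be sketched without panda-gym. This is only a minimal illustration, using the standard library: ToyEnv and goals_after_seeded_reset are hypothetical names, not part of panda-gym or Gymnasium, and the toy says nothing about how panda-gym handles seeding internally. It assumes that every random draw comes from a single generator owned by the environment, which is re-seeded only when reset receives a seed.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment; not part of panda-gym or Gymnasium."""

    def __init__(self):
        self._rng = random.Random()
        self.goal = None

    def reset(self, seed=None):
        # Re-seed only when a seed is given. Otherwise keep drawing from the
        # existing stream, so resets after a seeded reset stay deterministic.
        if seed is not None:
            self._rng = random.Random(seed)
        self.goal = [self._rng.uniform(-1.0, 1.0) for _ in range(3)]
        return list(self.goal)


def goals_after_seeded_reset(env, seed, n_resets):
    """Seed once, then collect the goals returned by n_resets unseeded resets."""
    env.reset(seed=seed)
    return [env.reset() for _ in range(n_resets)]


env = ToyEnv()
first = goals_after_seeded_reset(env, seed=10, n_resets=3)
second = goals_after_seeded_reset(env, seed=10, n_resets=3)

# Same seed, same sequence of goals across the later unseeded resets.
assert first == second
```

An environment would fail this kind of check if, for example, part of its state after a reset came from a source that reset(seed=...) does not re-seed.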

qgallouedec/panda-gym, issue #94