qgallouedec / panda-gym

Set of robotic environments based on PyBullet physics engine and gymnasium.
MIT License

reproduce the results #94

Open ChenyangRan opened 5 months ago

ChenyangRan commented 5 months ago

Hi. panda-gym does not let me set the random seed the way gym does, where env.seed(seed) makes the results reproducible. When I call env.reset(seed=10) with the same seed, I get the same return values, such as desired_goal. But the later calls to env.reset() without a seed do not give consistent states from one run to the next. Is there a way to make every subsequent reset reproducible, as in gym?

qgallouedec commented 5 months ago

Hey, thanks for the question. It seems like a bug, I'll fix it.

qgallouedec commented 4 months ago

I've only managed to solve the issue for some environments. I can't solve it for the others. The discussion is also ongoing here: https://github.com/Farama-Foundation/Gymnasium/issues/1111

To reproduce

import panda_gym
import gymnasium as gym
from gymnasium.utils.env_checker import data_equivalence

env = gym.make("PandaPickAndPlace-v3")

action_0 = env.action_space.sample()
action_1 = env.action_space.sample()

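# First run: seed once, then reset again without a seed.
_, _ = env.reset(seed=10)
obs_00, _, _, _, _ = env.step(action_0)
_, _ = env.reset()
obs_01, _, _, _, _ = env.step(action_1)

# Second run: same seed, same actions.
_, _ = env.reset(seed=10)
obs_10, _, _, _, _ = env.step(action_0)
_, _ = env.reset()
obs_11, _, _, _, _ = env.step(action_1)

# Both runs should produce identical observations, after the seeded
# reset and after the unseeded one.
assert data_equivalence(obs_00, obs_10)
assert data_equivalence(obs_01, obs_11)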
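
For reference, the behaviour the assertions above expect can be illustrated with a toy environment. This is only a sketch under stated assumptions: ToyGoalEnv and goals_after_seeded_reset are hypothetical names, and the code uses Python's random module instead of panda-gym or PyBullet. It assumes the contract is that reset(seed=...) reseeds the generator, while an unseeded reset() keeps drawing from that same generator.

```python
import random


class ToyGoalEnv:
    """Hypothetical stand-in for an environment; not panda-gym code."""

    def __init__(self):
        self._rng = random.Random()

    def reset(self, seed=None):
        # Reseed only when a seed is given. Otherwise keep drawing from
        # the existing generator, so later resets follow deterministically
        # from the last seeded reset.
        if seed is not None:
            self._rng = random.Random(seed)
        desired_goal = [self._rng.uniform(-1.0, 1.0) for _ in range(3)]
        return {"desired_goal": desired_goal}


def goals_after_seeded_reset(env, seed, n_resets):
    """Collect the goal from one seeded reset and n_resets unseeded ones."""
    goals = [env.reset(seed=seed)["desired_goal"]]
    for _ in range(n_resets):
        goals.append(env.reset()["desired_goal"])
    return goals


env = ToyGoalEnv()
first_run = goals_after_seeded_reset(env, seed=10, n_resets=2)
second_run = goals_after_seeded_reset(env, seed=10, n_resets=2)
print(first_run == second_run)
```

In this toy version the two runs match because every random draw goes through the single generator that reset(seed=...) replaces. Any state sampled from a source outside that generator would not be covered by the seed.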