stratisMarkou closed 5 years ago
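This is because gym.make("Pendulum-v0") returns the environment wrapped in a TimeLimit wrapper, not the actual PendulumEnv class. Assigning to env.state therefore sets an attribute on the wrapper, while the wrapped environment keeps using its own state. You can set the state of the underlying environment with env.env.state, as in the code below.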
import gym
import numpy as np

env = gym.make("Pendulum-v0")
env.reset()
print('state after reset:', env.state)
action = np.array([0.])
_, _, _, _ = env.step(action)
print('state after reset and step:', env.state)
# Assign to the wrapped environment (env.env), not to the wrapper itself
env.env.state = np.array([0.5, -1.])
print('state after assignment:', env.state)
_, _, _, _ = env.step(action)
print('state after assignment and step:', env.state)
Output is
state after reset: [0.70864103 0.82823708]
state after reset and step: [0.77445798 1.31633901]
state after assignment: [ 0.5 -1. ]
state after assignment and step: [ 0.46797846 -0.64043085]
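To illustrate why assigning to env.state on the wrapper appears to freeze the state, here is a minimal sketch that does not use gym at all. InnerEnv and Wrapper are hypothetical stand-ins for PendulumEnv and the wrapper; the sketch assumes the wrapper forwards unknown attribute reads to the wrapped environment through __getattr__, which is my understanding of how gym wrappers behave.

```python
class InnerEnv:
    """Hypothetical stand-in for PendulumEnv: owns the real state."""

    def __init__(self):
        self.state = 0.0

    def step(self):
        # The dynamics only ever read and write the inner state.
        self.state += 1.0


class Wrapper:
    """Hypothetical stand-in for a wrapper that forwards attribute reads."""

    def __init__(self, env):
        self.env = env

    def __getattr__(self, name):
        # Called only when normal lookup on the wrapper itself fails.
        return getattr(self.env, name)

    def step(self):
        self.env.step()


env = Wrapper(InnerEnv())
env.step()
forwarded = env.state      # read is forwarded to the inner env: 1.0

env.state = 5.0            # creates a NEW attribute on the wrapper
env.step()                 # the inner state still evolves: 1.0 -> 2.0
shadowed = env.state       # 5.0, the wrapper attribute hides the inner one
inner = env.env.state      # 2.0

del env.state              # remove the shadowing attribute
env.env.state = 5.0        # assign on the inner env instead
env.step()
fixed = env.state          # 6.0, forwarded again and evolving
print(forwarded, shadowed, inner, fixed)
```

The shadowing attribute stays at its assigned value forever, which matches the "stops evolving" symptom, while the inner environment keeps stepping from its own state.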
Great, works as desired, thanks!
Do the classical control environments support sampling at random from the whole space of allowed states? For example, in Pendulum, env.reset() resets the state to a random angle between -pi and +pi and a velocity between -1 and +1, but I would like the ranges to be -pi to +pi and -8 to +8, to match the allowed states of the environment.
I've tried naively setting env.state manually, but after manually resetting the state, it stops evolving.
Any ideas why this is happening or workarounds? Help would be much appreciated :)
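One possible workaround, as a rough sketch rather than anything built into gym: reset the environment as usual, then overwrite the state of the underlying environment with a sample drawn uniformly from the full range. The helper name reset_to_random_state and its max_speed default of 8.0 are assumptions taken from the ranges in the question; env.unwrapped is used to reach the innermost environment past any wrappers.

```python
import numpy as np


def reset_to_random_state(env, rng=None, max_speed=8.0):
    """Reset env, then place it at a uniformly sampled (angle, velocity).

    Hypothetical helper: the ranges [-pi, pi] and [-max_speed, max_speed]
    come from the question, not from a gym API.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Call reset first so wrapper bookkeeping (e.g. step counters) restarts.
    env.reset()
    state = np.array([
        rng.uniform(-np.pi, np.pi),          # angle
        rng.uniform(-max_speed, max_speed),  # angular velocity
    ])
    # Assign on the innermost environment, not on the wrapper.
    env.unwrapped.state = state
    return state


# Usage (assumes gym is installed and provides "Pendulum-v0"):
# import gym
# env = gym.make("Pendulum-v0")
# reset_to_random_state(env)
# obs, reward, done, info = env.step(np.array([0.]))
```

Note that the helper returns the raw state rather than an observation; the first observation consistent with the new state is the one returned by the next env.step call.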