Closed floringogianu closed 1 year ago
Hi floringogianu,
Thank you for making this PR.
Could you explain why `done = not (np.isfinite(state).all() or np.abs(state[1]) > .2)` fixes it? In my opinion, we should use `done = not np.isfinite(state).all() or (np.abs(state[1]) > .2).any()`, because `np.abs(state[1]) > .2` gives an `np.array` here. Correct me if I'm wrong. Thank you.
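To make the difference between the two expressions concrete, here is a minimal sketch that evaluates both on made-up observations. The function names and sample arrays are hypothetical, and the actual shape of `state` in the environment is not confirmed anywhere in this thread.

```python
import numpy as np


def done_original(state):
    # Expression quoted from the PR.
    return not (np.isfinite(state).all() or np.abs(state[1]) > .2)


def done_proposed(state):
    # Expression proposed in the comment above.
    return not np.isfinite(state).all() or (np.abs(state[1]) > .2).any()


# Hypothetical 1-D observation: state[1] is a scalar beyond the 0.2 limit.
tilted = np.array([0.0, 0.3, 0.0, 0.0])

# Hypothetical 2-D observation: state[1] is itself an array.
batched = np.array([[0.0, 0.0], [0.1, 0.3]])

# For a finite state, `or` short-circuits on the isfinite check, so the
# PR expression never looks at the threshold comparison.
print(done_original(tilted))   # False
print(done_proposed(tilted))   # True
print(done_proposed(batched))  # True
```

Under these assumptions, the proposed form handles both a scalar and an array `state[1]`, because `.any()` is available on NumPy scalars as well as arrays.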
@floringogianu Thanks for your contribution! @LostXine seems to have a valid argument here, no?
@benelot Sorry for not replying for a while, somehow I missed @LostXine's observation. I haven't worked with pybullet since last year, but I'll find the time to install it again and check this out. If `np.abs(state[1]) > .2` is indeed an array (although I don't remember that being the case), then yes, @LostXine's solution is the right one.
On `env.step()` the `done` signal should be a `bool`, not a tuple. Also check gym.envs.mujoco.inverted_pendulum.py.
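One way to guarantee a plain `bool` is to compute a "not done" condition and negate it, since Python's `not` always returns a built-in `bool`. The sketch below assumes a 1-D observation whose second entry is the pole angle; `compute_done` and `limit` are hypothetical names, and this is not claimed to be the exact code in either repository.

```python
import numpy as np


def compute_done(ob, limit=0.2):
    # Hypothetical helper: the episode continues only while every entry is
    # finite and the angle (assumed to be ob[1]) stays within the limit.
    notdone = np.isfinite(ob).all() and (np.abs(ob[1]) <= limit)
    # `not` converts the NumPy boolean into a built-in Python bool.
    return not notdone


done = compute_done(np.array([0.0, 0.3, 0.0, 0.0]))
print(type(done).__name__, done)  # bool True
```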