Closed ipsec closed 2 years ago
The error occurs when self._current_time_step receives an env_info different from {}.
Sometimes the env_info comes back as {'TimeLimit.truncated': array(True)}
So, in https://github.com/HorizonRobotics/alf/blob/2e12066c7988b551204e12fc413d7fe6ec75e97f/alf/nest/nest.py#L74 the method fails because the size is wrong:
working step
[array([1.], dtype=float32),
array([0], dtype=int32),
array([[-0.02575713, -0.03751579, 0.10740692, 0.23765415]], dtype=float32),
array([1]),
array([1.], dtype=float32),
array([1], dtype=int32)]
not working step
[array([0.], dtype=float32),
array([0], dtype=int32),
array([ True]),
array([[-0.43213508, -0.35315424, 0.00651777, 0.4939275 ]], dtype=float32),
array([0]),
array([1.], dtype=float32),
array([2], dtype=int32)]
The extra array([ True]) entry is being included in the flat_seq.
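The shape mismatch above can be reproduced with a minimal sketch (this is an illustration, not ALF's actual flatten code): when one step's env_info carries an extra key, the flattened sequences have different lengths, so batching corresponding entries across steps fails.

```python
import numpy as np

# Toy recursive flatten, illustrating the idea only -- ALF's real
# implementation lives in alf/nest/nest.py.
def flatten(nest):
    if isinstance(nest, dict):
        return [x for k in sorted(nest) for x in flatten(nest[k])]
    if isinstance(nest, (list, tuple)):
        return [x for item in nest for x in flatten(item)]
    return [nest]

# Two consecutive time steps (field names are assumptions for the demo):
step_ok = {"reward": np.array([1.0]), "env_info": {}}
step_bad = {"reward": np.array([0.0]),
            "env_info": {"TimeLimit.truncated": np.array(True)}}

print(len(flatten(step_ok)))   # 1
print(len(flatten(step_bad)))  # 2 -- the extra array(True) entry

# Stacking entries by position then breaks, because the two flat
# sequences no longer line up element-for-element.
```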
Hi @ipsec , sorry for the late reply. It seems that you are (for reasons unknown) using the TimeLimit
wrapper provided by Gym:
https://github.com/openai/gym/blob/master/gym/wrappers/time_limit.py
Whenever there is a timeout event, it puts a "TimeLimit.truncated" field into the env_info. However, ALF doesn't support env_info with fields that vary between steps. You can either write a gym wrapper that removes this field, or always fill it in the env_info.
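A minimal sketch of the first option (the class name and demo env are assumptions, not ALF or Gym APIs): a wrapper that strips the variable field so env_info keeps a consistent structure between steps.

```python
class StripTimeLimitInfo:
    """Wraps a Gym-style env and removes 'TimeLimit.truncated' from info.

    A plain delegating wrapper is shown here; in practice you would
    likely subclass gym.Wrapper instead.
    """

    def __init__(self, env):
        self._env = env

    def step(self, action):
        obs, reward, done, info = self._env.step(action)
        info.pop("TimeLimit.truncated", None)  # drop the variable field
        return obs, reward, done, info

    def __getattr__(self, name):
        # Delegate everything else (reset, render, ...) to the inner env.
        return getattr(self._env, name)


# Demo with a fake env that injects the offending field:
class FakeEnv:
    def step(self, action):
        return 0, 1.0, True, {"TimeLimit.truncated": True}


env = StripTimeLimitInfo(FakeEnv())
obs, reward, done, info = env.step(0)
print(info)  # {}
```

The other option, always filling the field, would instead do `info.setdefault("TimeLimit.truncated", False)` in `step()`.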
There is another good reason why ALF tries to avoid using this TimeLimit wrapper explained here:
https://alf.readthedocs.io/en/latest/tutorial/environments_and_wrappers.html#step-type-and-discount
In general, please make sure to do
gym_spec = gym.spec(environment_name)
gym_env = gym_spec.make()
to avoid letting Gym wrap the environment with its TimeLimit wrapper.
You can take a look at load() in alf/environments/suite_gym.py for an example.
Hi @hnyu.
I don't know why CartPole is using the TimeLimit wrapper. My custom env is working fine.
Thanks
Hi @ipsec , did you figure out the reason? I just ran the same training command on my side for CartPole, and it seemed to work without any problem.
Hi,
I'm having an issue running CartPole-v1 gym game.
After some steps I receive the error ValueError: all input arrays must have the same shape
I'm running CartPole with:
A similar error (I think because of RuntimeError: Different lengths!) occurs while running LunarLander-v2:
I'm running LunarLander with: