rlworkgroup / garage

A toolkit for reproducible reinforcement learning research.
MIT License

Issue with inputs flattening in DDPG+HER #2091

Open mazpie opened 3 years ago

mazpie commented 3 years ago

I am running into a problem with version v2020.06.3 when training DDPG+HER on the FetchReach-v1 environment. Running the provided example raises an IndexError:

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

which comes from

File "(...)/python3.7/site-packages/garage/tf/policies/continuous_mlp_policy.py", line 144, in get_actions
    observations = self.observation_space.flatten_n(observations)

From what I can see, the problem is that the OffPolicyVectorizedSampler already flattens the observations at line 113:

input_obses = obs_space.flatten_n(obses)

As a result, the observation_space used by the ContinuousMLPPolicy no longer matches the observations it receives from the sampler: the policy tries to flatten observations that have already been flattened.
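The double flattening can be reproduced outside garage with a minimal sketch. The `flatten_n` below is a hypothetical stand-in for a dict observation space's flattening method, not garage's or akro's actual implementation, and the key names and shapes are assumptions picked to resemble a FetchReach-style dict observation. Flattening once works; flattening the result a second time tries to index a plain array with a string key, which raises the same kind of IndexError as above.

```python
import numpy as np

# Hypothetical stand-in for a dict observation space's flatten_n;
# this is NOT garage's or akro's actual implementation.
KEYS = ("achieved_goal", "desired_goal", "observation")


def flatten_n(observations):
    """Flatten a batch of dict observations into a 2-D array."""
    return np.stack([
        np.concatenate([np.asarray(obs[key]).ravel() for key in KEYS])
        for obs in observations
    ])


# One FetchReach-like observation (shapes chosen for illustration only).
obs = {
    "observation": np.zeros(10),
    "achieved_goal": np.zeros(3),
    "desired_goal": np.ones(3),
}

# First flatten: roughly what the sampler does before calling the policy.
flat_once = flatten_n([obs])
first_shape = flat_once.shape

# Second flatten: roughly what the policy then does on the same data.
# Each row is now a plain 1-D array, so obs[key] with a string key fails.
try:
    flatten_n(flat_once)
    second_error = None
except IndexError as err:
    second_error = err

print(first_shape)
print(type(second_error).__name__, second_error)
```

If this sketch reflects what happens in the library, flattening in only one of the two places (either the sampler or the policy) would avoid the error.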

ryanjulian commented 3 years ago

@nicolengsy please triage!

nicolengsy commented 3 years ago

@mazpie Thanks for opening the issue. We'll address it in our upcoming milestone!