rlworkgroup / garage

A toolkit for reproducible reinforcement learning research.
MIT License

IndexError: boolean index did not match indexed array along dimension 0; dimension is 2400 but corresponding boolean dimension is 1600 #2297

Open tianyma opened 2 years ago

tianyma commented 2 years ago

Hi there, I am currently running RL2_TRPO on my custom 2D navigation environment. My hyperparameters are:

@click.option('--seed', default=1)
@click.option('--max_episode_length', default=200)
@click.option('--meta_batch_size', default=20)
@click.option('--n_epochs', default=10)
@click.option('--episode_per_task', default=4)

After I modified meta_batch_size, the following error occurs during the first policy optimization step (epoch #0 in the log below):

2021-07-29 16:06:06 | [rl2_trpo_jihuang_2d_nav] epoch #0 | Optimizing policy...
/lustre/S/matianyun/garage/src/garage/np/_functions.py:395: UserWarning: Creating a padded array with longer length than requested
  warnings.warn('Creating a padded array with longer length than '
Traceback (most recent call last):
  File "jihuang_2d_nav_rl2_trpo.py", line 94, in <module>
    rl2_trpo_jihuang_2d_nav()
  File "/lustre/S/matianyun/anaconda3/envs/jihuang/lib/python3.6/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/lustre/S/matianyun/anaconda3/envs/jihuang/lib/python3.6/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/lustre/S/matianyun/anaconda3/envs/jihuang/lib/python3.6/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/lustre/S/matianyun/anaconda3/envs/jihuang/lib/python3.6/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/lustre/S/matianyun/garage/src/garage/experiment/experiment.py", line 369, in __call__
    result = self.function(ctxt, **kwargs)
  File "jihuang_2d_nav_rl2_trpo.py", line 91, in rl2_trpo_jihuang_2d_nav
    meta_batch_size)
  File "/lustre/S/matianyun/garage/src/garage/trainer.py", line 396, in train
    average_return = self._algo.train(self)
  File "/lustre/S/matianyun/garage/src/garage/tf/algos/rl2.py", line 346, in train
    trainer.step_episode)
  File "/lustre/S/matianyun/garage/src/garage/tf/algos/rl2.py", line 364, in train_once
    self._inner_algo.optimize_policy(episodes)
  File "/lustre/S/matianyun/garage/src/garage/tf/algos/_rl2npo.py", line 72, in optimize_policy
    returns = self._fit_baseline_with_data(episodes, baselines)
  File "/lustre/S/matianyun/garage/src/garage/tf/algos/npo.py", line 474, in _fit_baseline_with_data
    returns = ret[val.astype(np.bool)]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 2400 but corresponding boolean dimension is 1600
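
For reference, the NumPy error at the bottom of the traceback can be reproduced in isolation. The sketch below is only an illustration of the error class, not a diagnosis of what garage does internally: the helper name `boolean_index_error` and the all-zeros/all-ones arrays are hypothetical stand-ins for `ret` and `val` in `npo.py`. It uses the built-in `bool` because the `np.bool` alias seen in the traceback was removed in newer NumPy releases. It also prints one observation from the settings above: 200 steps x 4 episodes per task gives 800 steps, and both 2400 and 1600 are whole multiples of 800. Whether that is related to the failure is an assumption.

```python
import numpy as np


def boolean_index_error(n_values, n_mask):
    """Index an array of length n_values with a boolean mask of length n_mask.

    Returns the IndexError message if the lengths disagree, else None.
    Both arrays are hypothetical stand-ins for `ret` and `val`.
    """
    ret = np.zeros(n_values)  # stand-in for the per-step returns
    val = np.ones(n_mask)     # stand-in for the 0/1 valid-step mask
    try:
        ret[val.astype(bool)]
    except IndexError as exc:
        return str(exc)
    return None


# Lengths taken from the error message in the traceback.
print(boolean_index_error(2400, 1600))

# Steps per concatenated trial under the settings listed in the issue.
max_episode_length = 200
episode_per_task = 4
steps_per_trial = max_episode_length * episode_per_task
print(steps_per_trial, 2400 // steps_per_trial, 1600 // steps_per_trial)
```

A mask of matching length indexes without error, so comparing `ret.shape` and `val.shape` just before the failing line would show which of the two arrays has the unexpected length.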