Closed nicola-pesavento closed 3 years ago
See documentation:
One current limitation of recurrent policies is that you must test them with the same number of environments they have been trained on.
See this comment for a possible solution: https://github.com/hill-a/stable-baselines/issues/166#issuecomment-502350843
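The workaround described in that comment can be sketched as follows. This is a minimal, self-contained illustration (using a stand-in policy function, not the actual stable-baselines API): tile the single evaluation observation up to the training batch size, run the policy, and keep only the first environment's action. The shapes `(4, 15)` come from the error message below.

```python
import numpy as np

N_TRAIN_ENVS = 4  # number of environments the recurrent policy was trained with
OBS_DIM = 15      # observation size, taken from the error message

def predict_single(policy_fn, obs):
    """Evaluate a policy that expects a (N_TRAIN_ENVS, OBS_DIM) batch
    on a single (1, OBS_DIM) observation by tiling it."""
    batch = np.tile(obs, (N_TRAIN_ENVS, 1))  # (1, 15) -> (4, 15)
    actions = policy_fn(batch)               # policy now sees the expected batch shape
    return actions[0]                        # keep only the first env's action

# Stand-in policy: returns one "action" per row of the batch.
dummy_policy = lambda batch: batch.sum(axis=1)

obs = np.ones((1, OBS_DIM))
action = predict_single(dummy_policy, obs)
print(action)  # -> 15.0
```

With a real recurrent model the same idea applies: tile the observation (and mask) before calling `predict`, then discard all but the first action. The upgrade mentioned below makes this manual workaround unnecessary.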
Edit: See post below. The easiest solution is likely to update stable-baselines with `pip install --upgrade git+https://github.com/hill-a/stable-baselines`.
Related PR and issue: https://github.com/hill-a/stable-baselines/pull/1017 and https://github.com/hill-a/stable-baselines/issues/1015
Thank you very much @Miffyli and @araffin ! I upgraded stable-baselines to the latest version and now it seems to work like a charm, great work!
Hi, I'm trying to train and evaluate an `A2C` model using 4 parallel environments for training and just 1 environment for evaluation. The code I'm using is the following:

The training proceeds well until the `eval_callback` is triggered; at that moment the following error occurs. In particular, the last error message is:
`Cannot feed value of shape (1, 15) for Tensor 'input/Ob:0', which has shape '(4, 15)'`
It seems that the model still requires inputs from 4 parallel environments during evaluation (it's not possible to use `EvalCallback` when training with `n_envs` > 1). Any suggestions? Thanks in advance!

Best, Nicola