Closed MijnheerD closed 4 years ago
After trying different things, it seems to me that the only way of solving this is to also use a CuriosityWrapper inside the EvalCallback
Yes, you need to use the same wrappers. But what you can do is add a test_mode argument to that wrapper (which you set to True before passing the env to the callback) so that it does nothing during evaluation.
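The test_mode suggestion above could look roughly like the sketch below. This is not the CuriosityWrapper from #309: `BonusWrapper`, `bonus`, and `CountingEnv` are hypothetical names used only for illustration, and the wrapper merely assumes that the wrapped object has `reset()` and `step(action)` methods.

```python
class CountingEnv:
    """Hypothetical stand-in environment: the episode ends after 3 steps."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        obs, reward, done, info = self.t, 1.0, self.t >= 3, {}
        return obs, reward, done, info


class BonusWrapper:
    """Hypothetical wrapper that adds a fixed reward bonus during training.

    With test_mode=True it forwards everything unchanged, so the same
    wrapper class can also be used for the evaluation env.
    """

    def __init__(self, env, bonus=0.5, test_mode=False):
        self.env = env
        self.bonus = bonus
        self.test_mode = test_mode

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if self.test_mode:
            # Evaluation: report the true reward and the true done flag.
            return obs, reward, done, info
        # Training: add the (assumed) exploration bonus to the reward.
        return obs, reward + self.bonus, done, info


train_env = BonusWrapper(CountingEnv())
eval_env = BonusWrapper(CountingEnv(), test_mode=True)

train_env.reset()
eval_env.reset()
train_reward = train_env.step(0)[1]  # true reward plus bonus
eval_reward = eval_env.step(0)[1]    # true reward only
```

The point of the flag is that training and evaluation share one wrapper class (so observation and action handling stay identical) while the evaluation env reports unmodified rewards and episode ends.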
Also, it seems that you are using the same env instance multiple times. Do:
env = DummyVecEnv([lambda: Monitor(CustomEnv(reward_func=FUNCTION), log_dir, allow_early_resets=True) for _ in range(num_cpu)])
instead of:
Ambiente = CustomEnv(reward_func=FUNCTION)
env = DummyVecEnv([lambda: Monitor(Ambiente, log_dir, allow_early_resets=True) for _ in range(num_cpu)])
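The difference between the two snippets above comes down to what each lambda returns when it is called. A minimal sketch without stable-baselines, where the hypothetical `DummyEnv` class stands in for `CustomEnv`:

```python
class DummyEnv:
    """Hypothetical stand-in for CustomEnv."""


num_cpu = 4

# Every factory returns the one object that was created beforehand.
shared = DummyEnv()
shared_factories = [lambda: shared for _ in range(num_cpu)]
shared_envs = [make() for make in shared_factories]

# Every factory constructs a new object when it is called.
fresh_factories = [lambda: DummyEnv() for _ in range(num_cpu)]
fresh_envs = [make() for make in fresh_factories]

# Count how many distinct objects each approach produced.
num_shared = len({id(env) for env in shared_envs})
num_fresh = len({id(env) for env in fresh_envs})
```

In the first case all workers would step and reset a single environment object, so their episodes interfere with each other; in the second case each worker gets its own instance.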
Thank you for your help! It works now :) For anyone who is also stuck on this, here is what I did:
filter_end_of_episode was set to True. This set done=True by default, never ending the loop.
CuriosityWrapper, as it was returned as a 0D array. By reshaping it to a 1D array, it worked just fine.
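For the reshaping step, here is a minimal NumPy sketch of the difference between a 0D and a 1D array. Which value was 0D in the original code is not stated here, so `value` below is only a placeholder.

```python
import numpy as np

value = np.asarray(0.5)          # 0D array: its shape is ()
as_1d = np.reshape(value, (1,))  # 1D array with a single element
also_1d = np.atleast_1d(value)   # helper that gives the same result here

try:
    len(value)                   # 0D arrays have no length
    has_len = True
except TypeError:
    has_len = False
```

Code that iterates over or indexes per-environment values expects at least one dimension, which is why a 0D array can fail where a 1D array of length 1 works.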
I am running into an issue when trying to use EvalCallback to periodically save the best model learned by a PPO2 agent. The problem is that I use a (custom) gym environment inside the EvalCallback and a wrapper inside the model. When running, it raises the error:

I have tried to explain the context down below, but essentially my question is this: can we use wrappers as the env when creating callbacks? And are we supposed to? Or is this a sign that the code has another issue?

Thank you in advance!
System Info Describe the characteristic of your environment:
Additional context
I am working on a project to train an agent to cancel out an incoming wave, using only a few observation points. For this, we have written a custom gym environment, Advection_training, to train the agent in. I used the check_env function to check the environment and it reports no issues. We use PPO2 to train the agent and save the best model periodically using EvalCallback. We used a SubprocVecEnv inside the model and the gym environment itself inside the callback, which raised a warning but no error. This is the piece of code we used, which runs without an error:

However, we then wanted to try to incorporate the CuriosityWrapper created by @NeoExtended in #309. He derived the wrapper from the class BaseTFWrapper. I modified the code as follows:

This raises an error when the callback runs:
After trying different things, it seems to me that the only way of solving this is to also use a CuriosityWrapper inside the EvalCallback. However, when I do this, the code seems to run very slowly. Without the callback, it finishes in a couple of minutes, but with the callback it was still running after 1 hour.