Closed Abermal closed 2 years ago
All of our algorithms expect a `VecEnv` (see SB3 docs). You don't need to write any more code for your environment: just write something like `venv = DummyVecEnv([lambda: MyEnv()])` and pass in `venv` instead of `MyEnv()`.
Stable Baselines does do some magic `Env`-to-`VecEnv` wrapping (such as in the excerpt you pasted), but `imitation` does not do this.
Hope this helps.
It helped, but I thought this wrapping was supposed to be performed automatically by stable-baselines.
There is another error though: in the `finish_trajectory` method, my last observation has its batch dimension preserved for some reason, which leads to an error in `np.stack(arr_list, axis=0)`.
My `reset` and `step` methods return the observation in the same format, `[1, obs_dim]`. Expert demonstrations follow this convention too.
This error doesn't happen in the `CartPole-v0` example.
Any ideas on how to fix this? Thanks once again for such a quick answer!
`imitation` is not stable-baselines. Just because they wrap the environment doesn't mean an unwrapped environment will work with our code.
I'd suspect an issue with what `reset()` is returning; are you sure it's not including an extra dimension? Otherwise I don't see why this environment would not work while others would. If you provide a minimal example to reproduce the error, I'm happy to try to debug.
> My reset and step methods return the observation in the same format `[1, obs_dim]`.

I think that might be the issue? `step()` and `reset()` observations should have the shape declared in the environment's `observation_space`, so just `(self._sett.n_sensors * 3,)` without the leading singleton dimension in your case. I haven't thought about if/why this would lead to the error you're seeing, but you should remove the leading 1 in any case.
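To illustrate why the stray batch dimension breaks trajectory collection, here is a pure-NumPy sketch (`obs_dim = 3` is an arbitrary example size, not the actual environment's):

```python
import numpy as np

obs_dim = 3  # arbitrary example size

# Per-step observations should have the shape declared in
# observation_space, i.e. (obs_dim,):
obs_list = [np.zeros(obs_dim), np.ones(obs_dim)]

# A final observation that kept a leading batch dimension has
# shape (1, obs_dim), so stacking fails:
bad_last = np.zeros((1, obs_dim))
try:
    np.stack(obs_list + [bad_last], axis=0)
except ValueError:
    print("np.stack fails on mismatched shapes")

# Dropping the singleton dimension makes all shapes consistent again:
good_last = bad_last.squeeze(0)  # shape (obs_dim,)
stacked = np.stack(obs_list + [good_last], axis=0)
print(stacked.shape)  # (3, 3)
```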
Thanks @ejnnr, it worked!
Hello, first of all, thank you for the repo. I'm using the following environment:
I have created a simple environment for a 2d game I wrote, and I would like to apply AIRL to obtain the reward function from my demonstrations. I fixed several things in my env and in the way I store the demonstrations so that it matches your repo, but I got stuck at the following error. I managed to debug it to the following place in a `BaseAlgorithm` class:

Correct me if I'm wrong, but the `DummyVecEnv` wrapper class is supposed to give my env asynchronous functionality, right?
I don't have much experience in async programming. Is there any way for me to use your repository without writing the `step_async` method?
Thank you for your answer!
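(For what it's worth, no async code is needed from you: `DummyVecEnv` is synchronous under the hood. Conceptually, its `step_async` merely records the actions and `step_wait` steps each wrapped env in a loop. The sketch below is a toy illustration of that design, not the real stable-baselines source; `TinyDummyVecEnv` and `OneStepEnv` are made-up names.)

```python
import numpy as np

class TinyDummyVecEnv:
    """Toy illustration of how a 'dummy' VecEnv can expose the async
    interface while running everything sequentially."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]
        self._actions = None

    def step_async(self, actions):
        # Nothing actually runs in the background; we just stash the actions.
        self._actions = actions

    def step_wait(self):
        # Step each wrapped env one after another, then batch the results.
        results = [env.step(a) for env, a in zip(self.envs, self._actions)]
        obs, rews, dones, infos = zip(*results)
        return np.stack(obs), np.array(rews), np.array(dones), list(infos)

    def step(self, actions):
        self.step_async(actions)
        return self.step_wait()

class OneStepEnv:
    """Trivial env whose episodes end after one step."""

    def step(self, action):
        return np.zeros(3), 1.0, True, {}

venv = TinyDummyVecEnv([lambda: OneStepEnv()])
obs, rews, dones, infos = venv.step(np.array([0]))
print(obs.shape)  # (1, 3): results come back with a batch dimension
```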