Hi @JennoMai, sorry for the delay, I forgot about this issue.
We haven't in fact experimented with multi-dimensional action spaces, but one thing I can say for sure is that the dynamics model used in PETS won't work out of the box for this. Notice this line. It refers to a 1-D model that is hard-coded to assume both state and action tensors are one dimensional, and it constructs model inputs by concatenating the two; this is the standard setup for the proprioceptive control problems that PETS and MBPO were originally proposed for.
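For illustration, here is a minimal sketch (not the library's exact code) of the kind of input construction that line performs; the observation/action lengths are assumptions taken from the spaces you describe below:

```python
import torch

batch_size = 32

# Flat 1-D observation and action vectors, as the default PETS model assumes
# (lengths 9 and 2 here, matching the spaces in the issue).
obs = torch.randn(batch_size, 9)
act = torch.randn(batch_size, 2)

# The model input is just a concatenation along the last dimension,
# so both tensors must have shape [batch, dim].
model_in = torch.cat([obs, act], dim=-1)  # shape [32, 11]

# An action with extra dimensions, e.g. [batch, n_agents, act_dim],
# would make this concatenation fail with a shape mismatch.
```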
For your particular application, you can probably use the main PETS skeleton, but you would need to replace the model architecture with something more appropriate for your application. You can take a look at our PlaNet implementation for an example of a different kind of model that receives multi-dimensional (visual) state data but uses the same planning algorithm as PETS.
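As a very rough illustration of what I mean by replacing the architecture (a hypothetical sketch only, not mbrl-lib's model interface), something along these lines could accept arbitrarily shaped observation and action tensors by flattening them before an MLP trunk:

```python
import torch
import torch.nn as nn

class MultiDimDynamicsModel(nn.Module):
    """Hypothetical dynamics model that accepts multi-dimensional
    observation/action tensors by flattening them before an MLP trunk.
    Illustrative only; a real replacement would need to plug into the
    library's model/trainer interfaces."""

    def __init__(self, obs_shape, act_shape, hidden_size=200):
        super().__init__()
        in_size = int(torch.tensor(obs_shape).prod() + torch.tensor(act_shape).prod())
        out_size = int(torch.tensor(obs_shape).prod())
        self.trunk = nn.Sequential(
            nn.Linear(in_size, hidden_size),
            nn.SiLU(),
            nn.Linear(hidden_size, out_size),
        )

    def forward(self, obs, act):
        # Flatten everything after the batch dimension so arbitrary
        # obs/act shapes share one concatenated input vector.
        x = torch.cat([obs.flatten(1), act.flatten(1)], dim=-1)
        return self.trunk(x)  # predicted next observation (flattened)
```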
Steps to reproduce
Observed Results
In the traceback below, I have a length-9 observation space and a length-2 action space; I believe the code is concatenating the two, but only a length-1 set of actions is being generated.
Expected Results
This runtime error shouldn't be thrown.
Relevant Code
The gym environment I'm using is very messy right now but can be found here, and the corresponding training code is here. However, the code depends heavily on the Botnet simulator, so it may be easier to try to replicate the issue using the MultiAgentEnv here.