[Question] Usage with Torch RL?

HagaiHargil commented 1 year ago

Question

I was trying to use a MiniWorld environment with TorchRL, which provides implementations of several popular RL algorithms, but I ran into some issues. TorchRL expects specific key names in the results returned after each step, and these don't match the key names I get here.

For example, it looks for an "observation" key in the returned dictionary, but FourRooms responds with "pixels", which I believe holds the equivalent data. If I change the expected key name to "pixels", I get a different error when running init_stats (RuntimeError: mean(): could not infer output dtype. Input dtype must be either a floating point or complex dtype. Got: Byte).
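The second error message says that mean() was called on 8-bit unsigned integer ("Byte") data, while it accepts only floating-point or complex input. A common remedy is to cast the frames to float (and usually scale them to [0, 1]) before computing any statistics. The snippet below is a minimal sketch of that idea in plain Python, using a `bytes` object as a stand-in for a uint8 frame; the helper name `to_unit_floats` and the sample frame are hypothetical. With a PyTorch tensor the analogous step would be calling `.float()` on it and dividing by 255.

```python
from statistics import fmean
from typing import List


def to_unit_floats(frame: bytes) -> List[float]:
    """Map 8-bit pixel values (0..255) to floats in [0.0, 1.0]."""
    return [value / 255.0 for value in frame]


# Hypothetical flattened 2x2 grayscale frame; real frames are RGB image arrays.
frame = bytes([0, 51, 102, 255])

# Cast and scale first, then take the mean over floating-point values.
scaled = to_unit_floats(frame)
mean_pixel = fmean(scaled)
```

Once the data is floating point, statistics such as the mean and standard deviation used for observation normalization are well defined.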

Any advice on this? I'm mostly trying to use PPO and other more advanced policies with the MiniWorld environments.

BolunDai0216 commented 1 year ago

@HagaiHargil, I have not used TorchRL myself, so I'm not sure what changes are required to make it run. If you only want to train a PPO agent on Miniworld environments, though, Stable-Baselines3 is fully supported.

If you want the environment to output a dictionary instead, you can just create a wrapper, see here for an example.

pseudo-rnd-thoughts commented 1 year ago

@HagaiHargil It should be possible to rename the "pixels" key to "observation" with a custom observation wrapper.
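A minimal sketch of such a wrapper, assuming the observation arrives as a dictionary with a "pixels" entry (as described in the question) and that the environment follows the Gymnasium `reset`/`step` signatures. The class names `RenameObservationKey` and `DummyPixelsEnv` are hypothetical; the dummy environment only stands in for a real one so the example runs without extra dependencies.

```python
class RenameObservationKey:
    """Wrap a Gymnasium-style env and rename one key of its dict observations."""

    def __init__(self, env, old_key="pixels", new_key="observation"):
        self.env = env
        self.old_key = old_key
        self.new_key = new_key

    def _rename(self, obs):
        # Shallow copy so the wrapped env's own dict is left untouched.
        obs = dict(obs)
        obs[self.new_key] = obs.pop(self.old_key)
        return obs

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        return self._rename(obs), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return self._rename(obs), reward, terminated, truncated, info


class DummyPixelsEnv:
    """Hypothetical stand-in env that returns dict observations keyed by "pixels"."""

    def reset(self, **kwargs):
        return {"pixels": [[0, 0], [0, 0]]}, {}

    def step(self, action):
        return {"pixels": [[1, 1], [1, 1]]}, 0.0, False, False, {}


env = RenameObservationKey(DummyPixelsEnv())
obs, info = env.reset()
next_obs, reward, terminated, truncated, step_info = env.step(0)
```

With Gymnasium installed, the more idiomatic route would likely be to subclass `gymnasium.ObservationWrapper`, override its `observation` method, and update `observation_space` to match the renamed key.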

HagaiHargil commented 1 year ago

Thanks for your comments. I was unfamiliar with Stable-Baselines3, so I'll try that out. I might go for the wrapper solution as well if necessary.

Farama-Foundation / Miniworld

[Question] Usage with Torch RL? #105