Closed wzjscut closed 2 months ago
The task-relevant observations, i.e. obs["task"], are already flattened and can be processed directly by MLPPolicy.
The other observations come from onboard sensors, e.g. rgb, depth, etc., and generally should not be flattened. Consider processing them separately with different encoders, e.g. Conv2DPolicy.
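For the MLP-only case, a minimal sketch of a wrapper that exposes just obs["task"] might look like the following. The class name `TaskObservationOnly` is hypothetical and not part of this repository. The sketch assumes the env's `observation_space` supports lookup by key (as Dict spaces do) and that the "task" sub-space is the flat Box mentioned above.

```python
class TaskObservationOnly:
    """Hypothetical wrapper: expose only obs["task"] from a dict-observation env."""

    def __init__(self, env, key="task"):
        self.env = env
        self.key = key
        # Assumption: the Dict space supports lookup by key, and the "task"
        # sub-space is the flat Box that an MLP policy can consume.
        self.observation_space = env.observation_space[key]
        self.action_space = env.action_space

    def reset(self, **kwargs):
        result = self.env.reset(**kwargs)
        # Newer env APIs return (obs, info); older ones return obs alone.
        if isinstance(result, tuple):
            return (result[0][self.key],) + tuple(result[1:])
        return result[self.key]

    def step(self, action):
        result = self.env.step(action)
        # The first element is the observation; everything else passes through.
        return (result[0][self.key],) + tuple(result[1:])
```

This only duck-types the env interface. If the training library requires a real environment subclass, the same key selection could instead go in the `observation()` method of a `gymnasium.ObservationWrapper` subclass.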
I would like to write a reinforcement learning example based on the method in example/learning/navigation_policy_demo.py, and I want to use MlpPolicy to train the policy. However, I found that the observations defined in env_base.py are all in Dict format. Is there any way to convert them to Box format?