Closed CUN-bjy closed 3 years ago
It seems impossible to learn a single-step stacking task with a sparse reward(e.g. PandaStack). We need to apply the multi-step(hierarchical) structure!
PandaPush-v1