A list of future improvements with short descriptions. These are currently not part of Design and decisions (https://github.com/AndrejOrsula/drl_grasping/issues/50). Issues for them might be created and worked on if time allows. However, their priority is lower than that of any open issues, as they are likely out of scope for this project.
Parallel environments
Run multiple simulation environments in parallel to make training more scalable.
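The idea above can be sketched with a minimal synchronous vectorised wrapper that steps several environment copies in lock-step and batches the results. `GraspEnv` here is a hypothetical stand-in for the project's actual grasping environment, not its real API.

```python
import numpy as np

class GraspEnv:
    """Toy single environment with a scalar observation (illustrative only)."""
    def reset(self):
        self.t = 0
        return 0.0
    def step(self, action):
        self.t += 1
        obs, reward, done = float(self.t), float(action), self.t >= 3
        return obs, reward, done

class SyncVectorEnv:
    """Steps N environment copies in lock-step and batches the results."""
    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]
    def reset(self):
        return np.array([env.reset() for env in self.envs])
    def step(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obs, rewards, dones = map(np.array, zip(*results))
        # Auto-reset finished environments so the batch always stays full.
        for i, done in enumerate(dones):
            if done:
                obs[i] = self.envs[i].reset()
        return obs, rewards, dones

vec_env = SyncVectorEnv([GraspEnv for _ in range(4)])
obs = vec_env.reset()
obs, rewards, dones = vec_env.step(np.ones(4))
```

In practice this would more likely be done with an asynchronous (subprocess-based) vectorised environment so the physics simulations actually run concurrently.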
Data augmentation
Data augmentation can go a long way (see e.g. https://arxiv.org/pdf/2004.14990.pdf).
Adding noise to observations is simple. More complex augmentation might be difficult with the octree structure - which data should be augmented, e.g. the RGB and depth images, or the point cloud before the octree is constructed?
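The simple case mentioned above can be sketched as Gaussian noise applied to the point cloud before octree construction. The noise magnitudes below are illustrative placeholders, not tuned values from the project.

```python
import numpy as np

def augment_point_cloud(points, colors, xyz_noise_std=0.005,
                        rgb_noise_std=0.02, rng=None):
    """Apply Gaussian noise to point positions (metres) and colours ([0, 1]).

    Hypothetical augmentation sketch; stds are illustrative, not tuned.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy_points = points + rng.normal(0.0, xyz_noise_std, points.shape)
    # Colours are clipped back into the valid [0, 1] range after perturbation.
    noisy_colors = np.clip(colors + rng.normal(0.0, rgb_noise_std, colors.shape),
                           0.0, 1.0)
    return noisy_points, noisy_colors
```

The octree would then be built from the augmented cloud, so the same augmentation transparently affects all derived features.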
Control force of grasps
The gripper currently applies the same force to all objects. However, this force might be excessive for some objects and would break them in real life. This problem could, for example, be formulated as an energy-minimisation term and incorporated into the reward function.
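One way the energy-minimisation idea could look, as a hedged sketch: a shaped reward that pays a bonus for a successful grasp minus a penalty proportional to the squared gripper force. The function name and coefficient are hypothetical.

```python
def grasp_reward(success, applied_force, force_penalty_coef=0.01):
    """Hypothetical shaped reward: grasp bonus minus an energy-like penalty.

    The quadratic penalty discourages squeezing harder than necessary,
    pushing the policy towards the minimum force that still holds the object.
    """
    reward = 1.0 if success else 0.0
    reward -= force_penalty_coef * applied_force ** 2
    return reward
```

With this shaping, two successful grasps are no longer equally rewarded: the gentler one scores higher, e.g. `grasp_reward(True, 2.0)` beats `grasp_reward(True, 10.0)`.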
Random position of scene (ground plane + object)
An agent trained in an environment where the position of the ground plane is randomised might generalise better and allow Sim2Real transfer even if the real setup does not match the simulation environment. The use of an octree (transformed into the robot coordinate frame) enables this feature.
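The transformation mentioned above amounts to expressing the observed points in the robot base frame before building the octree, so the observation is invariant to where the scene sits in the world. A minimal sketch, assuming a standard 4x4 homogeneous transform `T_robot_world` from world to robot frame:

```python
import numpy as np

def to_robot_frame(points_world, T_robot_world):
    """Express world-frame points in the robot base frame.

    points_world:  (N, 3) array of XYZ positions.
    T_robot_world: (4, 4) homogeneous transform from world to robot frame
                   (hypothetical name; obtained from the robot's known pose).
    """
    # Append a homogeneous coordinate of 1 to each point, transform, and
    # drop the homogeneous coordinate again.
    homog = np.hstack([points_world, np.ones((len(points_world), 1))])
    return (T_robot_world @ homog.T).T[:, :3]
```

Randomising the ground-plane pose then only changes the content of the cloud, not the frame in which the octree encodes it.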
Extend octree to cover the entire reachable workspace of the robot
With this addition, no extra information about the origin of the octree would need to be included, as the pose of all observed surfaces would already be spatially encoded. Therefore, the agent should be able to grasp objects located anywhere within its reach (if it generalises properly). This would obviously require much more training, computation and memory, since the depth of the octree would need to be increased further.
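The depth/memory trade-off can be made concrete with a little arithmetic: at depth d an octree divides its extent into 2^d cells per axis, so keeping the same voxel resolution while doubling the covered workspace requires one extra level, and each level multiplies the dense worst-case leaf count by 8. A small sketch:

```python
import math

def required_octree_depth(workspace_size, voxel_size):
    """Smallest depth d such that workspace_size / 2**d <= voxel_size.

    workspace_size and voxel_size are edge lengths in the same unit.
    Doubling workspace_size at a fixed voxel_size adds exactly one level,
    i.e. up to 8x more leaf voxels in the dense worst case.
    """
    return math.ceil(math.log2(workspace_size / voxel_size))
```

For example, covering a workspace with edge length 1.0 at 0.25-resolution needs depth 2, while a 2.0 workspace at the same resolution needs depth 3.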
Geometry of ground plane (heightmap)
Variety in the geometry of the ground plane would further improve generalisation. Currently, only a flat horizontal plane is used.
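One simple way such a heightmap could be generated, sketched under assumed parameters (grid size, amplitude and smoothing passes are illustrative): start from uniform noise and box-blur it a few times so the terrain is uneven but smooth enough to place objects on.

```python
import numpy as np

def random_heightmap(size=16, amplitude=0.02, smooth_passes=4, rng=None):
    """Generate a smooth random heightmap (metres) on a size x size grid.

    Hypothetical sketch: uniform noise in [-amplitude, amplitude], smoothed
    by averaging each cell with its four (wrapped) neighbours.
    """
    rng = np.random.default_rng() if rng is None else rng
    h = rng.uniform(-amplitude, amplitude, (size, size))
    for _ in range(smooth_passes):
        # Box blur via wrapped neighbour averaging; values stay in range.
        h = 0.25 * (np.roll(h, 1, 0) + np.roll(h, -1, 0)
                    + np.roll(h, 1, 1) + np.roll(h, -1, 1))
    return h
```

The resulting grid could then be loaded into the simulator as a heightmap terrain in place of the flat plane.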
Let agent terminate episode
Semantic grasping
A policy that grasps one specific object, e.g. based on an extra observation of the target object's position given as a goal (with the reward modified accordingly). Grasping a specific object class from clutter would also be very useful.
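The extra observation mentioned above could be as simple as appending the goal position to the policy's flat feature vector. A minimal sketch with hypothetical names:

```python
import numpy as np

def goal_conditioned_observation(octree_features, goal_position):
    """Concatenate the target object's XYZ position to the feature vector.

    octree_features: flat (F,) feature vector from the octree encoder
                     (hypothetical name for the existing observation).
    goal_position:   (3,) position of the object the policy should grasp.
    """
    return np.concatenate([octree_features, goal_position])
```

The reward would then be granted only for grasping the object at the goal position, making the policy goal-conditioned.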
Directional (ray-based) grasping
A policy that grasps objects from one specific direction (or area) instead of an arbitrary one, e.g. grasping a tool by its handle. For example, a ray could be given as an extra observation to indicate the goal, with more reward provided the closer the grasp matches it.
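The "more reward the closer it matches" idea could be realised, for instance, as the cosine similarity between the goal ray direction and the gripper's approach direction, clipped at zero. A hedged sketch with hypothetical names:

```python
import numpy as np

def ray_alignment_reward(ray_dir, approach_dir):
    """Reward in [0, 1] that grows as the grasp approach aligns with the ray.

    Both arguments are 3D direction vectors (need not be unit length);
    cosine similarity is clipped at zero so opposing directions score 0.
    """
    ray_dir = ray_dir / np.linalg.norm(ray_dir)
    approach_dir = approach_dir / np.linalg.norm(approach_dir)
    return max(0.0, float(ray_dir @ approach_dir))
```

This term would be added to the existing grasp reward, so a successful grasp from the indicated direction scores strictly higher than one from elsewhere.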