atyshka closed this issue 6 years ago.
You are not limited. In fact, taking several views is likely to improve grasp detection performance, so you're on the right track!
The point cloud in the video was taken from two different views (check out the beginning of this video).
The parameter files for the network were trained on data from the Bigbird dataset. The 53 and 90 degrees refer to the viewing angles used when that training data was captured.
I'm attempting a pick-and-place demo very similar to the one in your video, using a UR10 robot with a Robotiq gripper and a time-of-flight sensor mounted on the gripper. I want a detailed view of the objects for good segmentation, so I plan to move the arm around and combine several point clouds into a more detailed one. This wasn't shown in the video, but given how detailed the point cloud is, I'm assuming that is how you did it.

This is fairly similar to using two separate depth cameras, which you describe in the README. However, I'm trying to figure out why you needed a different Caffe model for the two-camera setup. Assuming the data is properly transformed and merged (see the sketch below), the point cloud should be the same regardless of which perspectives the cameras captured it from, correct? I'm trying to understand why I would be limited to taking only two snapshots at 90 or 53 degrees.
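For context, here is a minimal sketch (not from the original thread) of what "properly transformed and merged" could look like with PCL: each snapshot is moved from the camera frame into the robot's base frame using the camera pose obtained from forward kinematics / TF, the clouds are concatenated, and the result is voxel-filtered to even out point density. The function name `mergeClouds`, the `cam_to_base` transforms, and the 3 mm leaf size are all hypothetical and would need to be adapted to the actual setup.

```cpp
// Sketch only: merge point clouds captured from several arm poses into the
// robot base frame. Assumes one camera-to-base transform per snapshot,
// e.g. from the robot's forward kinematics or TF.
#include <vector>
#include <Eigen/Geometry>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/common/transforms.h>
#include <pcl/filters/voxel_grid.h>

pcl::PointCloud<pcl::PointXYZ>::Ptr
mergeClouds(const std::vector<pcl::PointCloud<pcl::PointXYZ>::Ptr>& clouds,
            const std::vector<Eigen::Affine3f>& cam_to_base)
{
  pcl::PointCloud<pcl::PointXYZ>::Ptr merged(new pcl::PointCloud<pcl::PointXYZ>);
  for (size_t i = 0; i < clouds.size(); ++i)
  {
    pcl::PointCloud<pcl::PointXYZ> in_base;
    // Transform snapshot i from its camera frame into the common base frame.
    pcl::transformPointCloud(*clouds[i], in_base, cam_to_base[i]);
    *merged += in_base;  // concatenate points
  }

  // Downsample so overlapping regions don't dominate before segmentation
  // and grasp detection. Leaf size is a placeholder; tune for your sensor.
  pcl::VoxelGrid<pcl::PointXYZ> voxel;
  voxel.setInputCloud(merged);
  voxel.setLeafSize(0.003f, 0.003f, 0.003f);
  pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
  voxel.filter(*filtered);
  return filtered;
}
```

The key assumption in the question is exactly what this sketch relies on: once all snapshots are expressed in the same base frame, the merged cloud no longer depends on where each individual view was taken from.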