dougsm / ggcnn

Generative Grasping CNN from "Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach" (RSS 2018)
BSD 3-Clause "New" or "Revised" License

The benefit of feeding the RGB data #10

Closed by jundengdeng 5 years ago

jundengdeng commented 5 years ago

Hi Dougsm,

First of all, thank you for sharing the GGCNN code. I have one question: does adding RGB data to the grasping network improve grasping performance?

Best,

Jun

youkenhou commented 5 years ago

I trained the network with depth only for 30 epochs and the best IoU was 74%. With RGB added, the best IoU reached 78% (although IoU does not correspond exactly to grasping performance in the real world). In my opinion, sometimes objects are so small that the depth maps contain little information; RGB can give the network more to work with in this kind of situation.
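For anyone wondering what the RGB-D variant changes in code, it is essentially just the input channel count. Here is a minimal sketch, assuming a PyTorch-style pipeline; `make_input` and the layer sizes are illustrative, not the repository's actual code:

```python
# Illustrative sketch only: stacking modalities into the network input.
import numpy as np
import torch
import torch.nn as nn

def make_input(depth=None, rgb=None):
    """Stack the chosen modalities into a (C, H, W) float tensor.

    depth: (H, W) float array, rgb: (H, W, 3) uint8 array.
    Depth-only -> 1 channel, RGB-only -> 3, RGB-D -> 4.
    """
    channels = []
    if depth is not None:
        channels.append(depth[np.newaxis].astype(np.float32))
    if rgb is not None:
        # Normalise to [0, 1] and move channels first (HWC -> CHW).
        channels.append(rgb.transpose(2, 0, 1).astype(np.float32) / 255.0)
    return torch.from_numpy(np.concatenate(channels, axis=0))

# The only architectural change between the variants is in_channels
# of the first convolution (1 for depth-only, 4 for RGB-D).
first_conv = nn.Conv2d(4, 32, kernel_size=9, stride=3, padding=3)

x = make_input(depth=np.zeros((300, 300)),
               rgb=np.zeros((300, 300, 3), dtype=np.uint8))
out = first_conv(x.unsqueeze(0))  # add a batch dimension
print(out.shape)  # torch.Size([1, 32, 100, 100])
```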

dougsm commented 5 years ago

Overall I found it didn't help very much, and in some cases it was a negative. I agree with what @youkenhou said: in cases where there is minimal depth data it helps, and it improves performance within the dataset. However, there were two main issues with using RGB that I found in practice:

jundengdeng commented 5 years ago

Thanks @dougsm and @youkenhou for your input. We found that our method depends heavily on the quality of the depth camera. We are currently using the D435 and D415, which restricts our grasping method to large objects in cluttered scenes. If we try to grasp an object only 2 cm wide, or one with a black surface (a cell phone, for example), the depth camera can't provide much depth information. I'm looking for a solution for this case, and was thinking that maybe RGB would help.
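A common mitigation for holes in the depth image (thin, dark, or reflective objects) is to inpaint the invalid readings before they reach the network; the GG-CNN paper describes inpainting invalid depth values with OpenCV. The sketch below only illustrates that idea; `inpaint_depth`, the scaling, and the radius are assumptions, not the repository's exact preprocessing:

```python
# Illustrative sketch: filling invalid depth readings by inpainting.
import cv2
import numpy as np

def inpaint_depth(depth, missing_value=0.0):
    """Fill holes (zeros/NaNs from a RealSense) in a (H, W) depth image."""
    depth = depth.astype(np.float32).copy()
    depth[np.isnan(depth)] = missing_value
    mask = (depth == missing_value).astype(np.uint8)

    # cv2.inpaint accepts 32-bit float single-channel images; scale to
    # roughly unit range so the Navier-Stokes fill is well conditioned.
    scale = max(float(depth.max()), 1e-6)
    filled = cv2.inpaint(depth / scale, mask, inpaintRadius=3,
                         flags=cv2.INPAINT_NS)
    return filled * scale
```

Note that inpainting only interpolates from surrounding valid pixels, so it helps with small holes but can't recover geometry the sensor never measured, which is exactly the situation where RGB might still add value.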