Thank you for your interest. You can get a prediction result using predict.py. In this branch, I simply loaded some cropped depth images from the dexnet dataset and used them as input to the network. To work with your own data, crop your depth image to the same shape as the dexnet depth images and use it as input to the network.
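In case it helps, here is a minimal sketch (not the repo's code) of that cropping step. The 32x32 patch size and the chosen center pixel are assumptions for illustration:

```python
import numpy as np

def crop_depth_image(depth, center_row, center_col, size=32):
    """Return a size x size patch of `depth` centered on (center_row, center_col)."""
    half = size // 2
    patch = depth[center_row - half:center_row + half,
                  center_col - half:center_col + half]
    # GQ-CNN style networks usually expect a trailing channel dimension.
    return patch.reshape(size, size, 1).astype(np.float32)

# Example: crop around an arbitrary pixel of a hypothetical depth image.
depth = np.random.uniform(0.5, 0.7, size=(480, 640)).astype(np.float32)
patch = crop_depth_image(depth, center_row=240, center_col=320)
print(patch.shape)  # (32, 32, 1)
```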
Thanks for your reply. In predict.py, the input to the network is a depth image and a pose, while the output seems to be a two-dimensional vector. I don't understand why the input includes a pose, or what the output means.
As the Dex-Net 3.0 paper says, the grasp pose is a two-dimensional vector consisting of the grasp depth (end-effector depth from the camera) and the orientation (which I remember as an angle relative to the camera).
GQ-CNN simply classifies the input into [suction fail, suction success], which yields the two-dimensional output vector. You can simply apply argmax to the output to get the result.
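For illustration, a minimal sketch of how the pose vector and the argmax step fit together; the variable names and numbers are purely hypothetical, not the repo's API:

```python
import numpy as np

# Pose input: [grasp depth, orientation angle].
grasp_depth = 0.65   # end-effector depth from the camera, in meters (assumed units)
orientation = 0.3    # grasp angle relative to the camera, in radians (assumed units)
pose = np.array([grasp_depth, orientation], dtype=np.float32)

# Hypothetical network output for one grasp candidate: [suction fail, suction success].
output = np.array([0.12, 0.88])
predicted_class = int(np.argmax(output))  # 1 -> suction success
success_prob = float(output[1])
print(predicted_class, success_prob)
```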
If I have a depth image of an object, how do I get the suction point to grasp the object?
Try sampling several grasp candidates on the object surface. In my case, I sampled candidates using normal vectors from the point cloud to find smooth surfaces. Then I cropped the image so that the grasp candidate is at the center (maybe 32x32; sorry for the uncertainty, as this repo is a bit old). Lastly, build a grasp pose vector from the information you have (see the sketch below).
I believe this is what you need to get a suction point.
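For anyone reading later, a rough sketch of that pipeline in plain NumPy; the gradient-based normal estimation, the smoothness threshold, the patch size, and the zero grasp angle are simplifying assumptions, not the exact code I used:

```python
import numpy as np

def estimate_normals(depth):
    """Approximate surface normals from depth-image gradients (camera frame)."""
    dz_dv, dz_du = np.gradient(depth)
    normals = np.dstack([-dz_du, -dz_dv, np.ones_like(depth)])
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    return normals

def sample_candidates(depth, num_samples=100, window=5, flat_thresh=0.99, seed=0):
    """Randomly sample pixels and keep those lying on locally smooth surfaces."""
    rng = np.random.default_rng(seed)
    normals = estimate_normals(depth)
    h, w = depth.shape
    half = window // 2
    candidates = []
    for _ in range(num_samples):
        r = rng.integers(half, h - half)
        c = rng.integers(half, w - half)
        patch_n = normals[r - half:r + half + 1, c - half:c + half + 1].reshape(-1, 3)
        mean_n = patch_n.mean(axis=0)
        mean_n /= np.linalg.norm(mean_n)
        # Smooth surface: neighboring normals point in nearly the same direction.
        if np.mean(patch_n @ mean_n) > flat_thresh:
            candidates.append((r, c))
    return candidates

# Synthetic, nearly planar depth image just to make the sketch runnable.
u, v = np.meshgrid(np.arange(640), np.arange(480))
depth = (0.6 + 1e-4 * u + 1e-4 * v).astype(np.float32)

for r, c in sample_candidates(depth)[:3]:
    patch = depth[r - 16:r + 16, c - 16:c + 16]             # 32x32 crop, candidate at center
    pose = np.array([depth[r, c], 0.0], dtype=np.float32)    # [grasp depth, angle] (angle assumed 0)
    # Feed (patch, pose) to the network and keep the candidate with the
    # highest predicted suction-success probability.
    print(r, c, patch.shape, pose)
```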
Thanks for your work, but I didn't see the prediction in this part of the code.