tnikolla / robot-grasp-detection

Detecting robot grasping positions with deep neural networks. The model is trained on the Cornell Grasping Dataset. This is an implementation based mainly on the paper 'Real-Time Grasp Detection Using Convolutional Neural Networks' by Redmon and Angelova.
Apache License 2.0

About multiplication factors 0.35 & 0.47 #2

Closed wuguangbin1230 closed 6 years ago

wuguangbin1230 commented 6 years ago

Hi author,

In your program there are the values 0.35 and 0.47. What do they mean?

Thank you!

```python
import tensorflow as tf

def bboxes_to_grasps(bboxes):
    """Convert and scale bounding boxes into grasps, g = {x, y, tan, h, w}."""
    box = tf.unstack(bboxes, axis=1)  # bboxes <tf.Tensor 'batch_join:1' shape=(64, 8) dtype=float32>
    x = (box[0] + (box[4] - box[0]) / 2) * 0.35
    y = (box[1] + (box[5] - box[1]) / 2) * 0.47
    tan = (box[3] - box[1]) / (box[2] - box[0]) * 0.47 / 0.35
    h = tf.sqrt(tf.pow((box[2] - box[0]) * 0.35, 2) + tf.pow((box[3] - box[1]) * 0.47, 2))
    w = tf.sqrt(tf.pow((box[6] - box[0]) * 0.35, 2) + tf.pow((box[7] - box[1]) * 0.47, 2))
    return x, y, tan, h, w
```
tnikolla commented 6 years ago

Hi! Sorry that I'm so late with the answer.

The original images are 640x480 pixels, and the vertices of the ground-truth grasp rectangles have to be scaled whenever the images are scaled. For example, if the x-coordinate of a grasp vertex is 200 in the original dataset and I scale the images to 224x224, the annotated vertex has to be scaled too:

x_scaled = 200 * 224/640 = 200 * 0.35 = 70

The corresponding y value of the same vertex scales with a factor of 224/480 ≈ 0.47.
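
To make this concrete, here is a minimal Python sketch (illustrative only, not code from the repo) that derives both factors from the image sizes and applies them to a vertex; it also shows why tan picks up the ratio 0.47/0.35 under this anisotropic scaling:

```python
# Minimal sketch: scaling grasp-rectangle vertices when resizing
# a 640x480 Cornell image to a 224x224 network input.
ORIG_W, ORIG_H = 640, 480
NEW_W, NEW_H = 224, 224

sx = NEW_W / ORIG_W  # 224/640 = 0.35
sy = NEW_H / ORIG_H  # 224/480 ≈ 0.4667, rounded to 0.47 in bboxes_to_grasps

def scale_vertex(x, y):
    """Map a vertex from original-image to resized-image coordinates."""
    return x * sx, y * sy

print(scale_vertex(200, 200))  # -> (70.0, 93.33...)

# tan is a slope dy/dx, so scaling x by sx and y by sy multiplies it
# by sy/sx, which is the 0.47/0.35 factor in bboxes_to_grasps.
print(sy / sx)                 # -> 1.333...
```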

I also uploaded the models. Regards!