graspnet / graspnet-baseline

Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)
https://graspnet.net/

Grasp Score greater than 1 #23

Closed by FrancescoRosa3 2 years ago

FrancescoRosa3 commented 3 years ago

Hi everyone. I am using your work to build a vision-based grasping system for my master's thesis. When I run demo.py and look at the grasp scores, some grasps have a score greater than 1, but based on your paper I would have expected a maximum grasp score of 1. Why are there grasp scores greater than 1? Thanks.

chenxi-wang commented 3 years ago

Hi, in the original baseline implementation we computed ground-truth (gt) scores as s = 1.1 - u, with u in [0.0, 1.0]. In the current implementation, we use s = 0 when u = 0 and s = -ln(u) when u is in [0.1, 1.0]. With this definition, gt scores range from 0 to ln(10). The definition can be found here.
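To make the two definitions concrete, here is a minimal sketch of both mappings from u to score s. The function names are hypothetical and are not taken from the repository; only the formulas come from the comment above.

```python
import math


def score_original(u: float) -> float:
    """Original baseline mapping: s = 1.1 - u, for u in [0.0, 1.0]."""
    return 1.1 - u


def score_current(u: float) -> float:
    """Current mapping as described above: s = 0 when u = 0, else s = -ln(u)."""
    if u == 0:
        return 0.0
    # -ln(u) written as ln(1 / u); the two forms are mathematically equal.
    return math.log(1.0 / u)


if __name__ == "__main__":
    # The smallest nonzero u (0.1) gives the largest score, ln(10).
    for u in (0.1, 0.5, 1.0):
        print(u, score_original(u), score_current(u))
    print("upper bound ln(10) =", math.log(10))
```

Since ln(10) is roughly 2.30, scores above 1 are expected under the current definition.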

FrancescoRosa3 commented 3 years ago

Thanks for your quick response. It is clear, so the grasp score ranges from 0 to ln(1.0), is it correct? I have another question, your work seems to be "gripper agnostic", but from the grasp definition, I understand the following:

Then there is an offset of 2 cm, which should define the distance between the origin of the gripper coordinate frame and the palm of the gripper. If my gripper has a different offset (smaller or larger), is it possible to set this parameter?

chenxi-wang commented 3 years ago

Grasp score ranges from 0 to ln(10) (not ln(1.0)). Depth is the distance from the origin of gripper coordinate frame to the gripper tip, and contact points usually lie between them. Detailed definition can be found in graspnetAPI.

The height and the 2 cm offset are used in collision detection and in graspnet evaluation, and can be modified according to your requirements in real experiments. The gripper tip coordinates computed from the modified gripper parameters and those from the original graspnet format need to be aligned.

FrancescoRosa3 commented 3 years ago

I am sorry for the typo. I am probably missing something, but based on the comment "Depth is the distance from the origin of gripper coordinate frame to the gripper tip, and contact points usually lie between them", this depth should be constant, because the origin of the gripper coordinate frame is fixed, as is its distance from the gripper tip. After running the demo, however, I have not observed this behavior. What am I missing? Moreover, what do you mean by "modified gripper parameters and original graspnet format need to be aligned"? Thank you for your kindness.

chenxi-wang commented 3 years ago

In the output parameters, the coordinates of a grasp point are not the same as those of the gripper bottom. The position of the gripper tip is computed from the grasp point and the depth (which is not a constant), and the position of the bottom is computed from the tip coordinate and the fixed gripper length (a variable you can set yourself).
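The relationship described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the approach direction is assumed to be given as a vector pointing from the gripper toward the object, and the sign conventions (tip ahead of the grasp point by the depth, bottom behind the tip by the gripper length) are assumptions rather than the repository's actual code.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def gripper_tip_and_bottom(
    grasp_point: Sequence[float],
    approach: Sequence[float],
    depth: float,
    gripper_length: float,
) -> Tuple[Vec3, Vec3]:
    """Return (tip, bottom) positions in the same frame as grasp_point.

    Assumption: `approach` points from the gripper toward the object.
    """
    norm = math.sqrt(sum(c * c for c in approach))
    if norm == 0:
        raise ValueError("approach direction must be non-zero")
    unit = [c / norm for c in approach]
    # Tip: grasp point shifted along the approach direction by the depth.
    tip = tuple(p + depth * d for p, d in zip(grasp_point, unit))
    # Bottom: tip shifted back along the approach direction by the gripper length.
    bottom = tuple(t - gripper_length * d for t, d in zip(tip, unit))
    return tip, bottom


if __name__ == "__main__":
    tip, bottom = gripper_tip_and_bottom((0.0, 0.0, 0.5), (0.0, 0.0, 1.0), 0.02, 0.1)
    print("tip:", tip)
    print("bottom:", bottom)
```

Here the depth varies per predicted grasp, while gripper_length is the fixed value you would set for your own gripper.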

FrancescoRosa3 commented 3 years ago

Thank you for all your clarifications. After reading both the paper and the code very carefully, and thinking about your comment, I have the following observations:

  1. The translation field of a Gripper is the coordinate of the grasp point in the image coordinate frame, isn't it?
  2. The offset of 2 cm should be equal to hmin = -0.02.
  3. The depth can assume 4 values equal to hmax_list=[0.01,0.02,0.03,0.04], but I do not understand why this value should be "the distance from the origin of gripper coordinate frame to the gripper tip". It seems to me, that this value is used only here, and then we take the height that produces the grasp with the highest score here. Moreover, there is no "gripper coordinate frame", what am I missing about the depth? What is its physical meaning?
  4. If I want to modify the code in order to take into account my gripper geometry, I should modify:

Thanks for your kindness.

chenxi-wang commented 3 years ago

The translation field of a Gripper is the coordinate of the grasp point in the image coordinate frame, isn't it?

It should be the camera coordinate frame, which is obtained from the image coordinate frame using this function.
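The linked function is not reproduced in this thread. As a rough illustration of what such a transform typically does, the sketch below back-projects a pixel with a known depth into the camera frame using the standard pinhole model. The function name and the parameter names (fx, fy, cx, cy) are assumptions for illustration, not the repository's API.

```python
from typing import Tuple


def pixel_to_camera(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) with depth z into camera coordinates.

    Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


if __name__ == "__main__":
    # A pixel at the principal point lies on the optical axis.
    print(pixel_to_camera(320.0, 240.0, 0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0))
```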

The depth can assume 4 values equal to hmax_list=[0.01,0.02,0.03,0.04], but I do not understand why this value should be "the distance from the origin of gripper coordinate frame to the gripper tip".

It is a definition: the position of the gripper tip is computed from the depth and the grasp point.

If I want to modify the code in order to take into account my gripper geometry, I should modify: ...... hmin and hmax_list.

hmax_list is related to the depths sampled in grasp labels, which should not be modified.
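For the depth selection mentioned in the question (four candidate depths, keeping the one with the highest score), a minimal sketch could look like the following. The function name and the input layout (one predicted score per candidate depth) are assumptions for illustration; only the list values come from the thread.

```python
from typing import Sequence, Tuple

# Candidate depths in meters, as quoted in the thread.
HMAX_LIST = [0.01, 0.02, 0.03, 0.04]


def best_depth(scores_per_depth: Sequence[float]) -> Tuple[float, float]:
    """Return (depth, score) for the highest-scoring candidate depth."""
    if len(scores_per_depth) != len(HMAX_LIST):
        raise ValueError("expected one score per candidate depth")
    best_index = max(range(len(HMAX_LIST)), key=lambda i: scores_per_depth[i])
    return HMAX_LIST[best_index], scores_per_depth[best_index]


if __name__ == "__main__":
    print(best_depth([0.2, 1.7, 0.9, 0.4]))
```

Because the candidates are tied to the depths sampled in the grasp labels, the list itself stays fixed; only the selected entry changes per grasp.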

FrancescoRosa3 commented 3 years ago

Thank you, it seems clearer to me now. So, if I want to use this method for my experiments, I only have to modify the variables in the code that refer to GRIPPER dimensions, is that correct? These variables mainly affect collision detection, since the inference seems to be gripper independent, doesn't it? Thanks for your kindness.

chenxi-wang commented 3 years ago

Yes, most of the gripper parameters are used in collision detection.