PKU-EPIC / UniDexGrasp

Official code for "UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy" (CVPR 2023)

Choose "random" or "grid" for function generate_queries in ipdf_network.py #9

Closed: waqc closed this issue 9 months ago

waqc commented 10 months ago

Thanks for this amazing repo! I have a small question about how the code relates to the paper. The generate_queries function in dexgrasp_generation/network/models/graspipdf/ipdf_network.py has two modes, "random" and "grid". As I understand the paper, GraspIPDF generates rotations in "grid" mode, but the code uses "random" mode. Is that because "random" mode performs better?

Also, in "random" mode the rotation matrices are generated directly by pytorch3d.transforms.random_rotations. Are these rotations uniformly distributed over SO(3)?

Thanks a lot!

XYZ-99 commented 9 months ago

I'll answer the second question first. Yes, these rotations are sampled from a uniform distribution over SO(3).
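For readers who want to see what "uniform over SO(3)" means in practice, here is a minimal sketch of one standard construction: normalize a 4D Gaussian vector into a unit quaternion and convert it to a rotation matrix. It uses only the Python standard library and is not code from this repository or from pytorch3d; the helper names (random_rotation, trace, det3) are made up for illustration. The sanity check relies on the fact that the expected trace of a uniformly distributed rotation is 0.

```python
import math
import random


def random_rotation(rng):
    """Draw one 3x3 rotation matrix uniformly from SO(3)."""
    # A normalized 4D standard normal vector is uniform on the unit 3-sphere,
    # and unit quaternions map onto SO(3) with the uniform (Haar) measure.
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def trace(m):
    return m[0][0] + m[1][1] + m[2][2]


def det3(m):
    """Determinant of a 3x3 matrix; it is +1 for a proper rotation."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


rng = random.Random(0)  # fixed seed so the check is repeatable
samples = [random_rotation(rng) for _ in range(5000)]
# For uniformly distributed rotations the mean trace should be close to 0.
mean_trace = sum(trace(r) for r in samples) / len(samples)
print(f"mean trace over {len(samples)} samples: {mean_trace:.3f}")
```

A mean trace far from 0 would indicate a biased sampler, for example one that draws Euler angles uniformly.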

For the first question, let me elaborate. (To avoid confusion from the overloaded term, I put quotes around "grid" when I mean the "grid" mode; without quotes, grid simply denotes a discretization of SO(3), stored with shape [num, 3, 3] in our code.)

A little background context:

  1. The "grid" mode generates an equivolumetric grid over SO(3), which is deterministic and perfectly even over SO(3). The "random" mode samples a given number of rotations from a uniform distribution over SO(3). If the number is sufficiently large, either mode is expected to provide a decent discretization of SO(3).
  2. Whenever we predict the probability of an arbitrary rotation, we rotate the grid so that one of its rotations is exactly the input rotation. We then query the network with every rotation in that grid and normalize the outputs to get the probabilities.
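The align-then-normalize step in point 2 can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: the grid here is a small toy set of rotations rather than an equivolumetric grid, toy_score is a hypothetical stand-in for the network's unnormalized output, and composing on the right is just one possible alignment convention.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def align_grid(grid, target):
    """Rotate the whole grid so that its first element becomes `target`."""
    # delta satisfies grid[0] @ delta == target because grid[0] is orthogonal.
    delta = matmul(transpose(grid[0]), target)
    return [matmul(r, delta) for r in grid]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_score(r):
    """Hypothetical stand-in for the network: peaks at the identity rotation."""
    return r[0][0] + r[1][1] + r[2][2]


# Toy grid of 16 rotations (an assumption; far coarser than a real grid).
angles = [k * math.pi / 2 for k in range(4)]
toy_grid = [matmul(rot_z(a), rot_x(b)) for a in angles for b in angles]

target = matmul(rot_z(0.7), rot_x(1.1))
queries = align_grid(toy_grid, target)  # queries[0] is the target itself
probs = softmax([toy_score(q) for q in queries])
print(f"probability assigned to the target: {probs[0]:.4f}")
```

Because the target always sits at index 0 of the aligned grid, its probability is simply the first entry of the normalized outputs.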

Now if we choose the "grid" mode during training, for each training sample, its probability is normalized over the same grid. This will not be a problem if:

  1. the number of training samples is sufficiently large;
  2. the distribution of rotations in the training samples is sufficiently even across SO(3).

However, these conditions may not always hold, because the threshold for "sufficiently" is fairly blurry. In that case it is theoretically possible that the probabilities of rotations that fall outside those grids are left unconstrained during training, which can cause trouble when we sample at test time.

If we use the "random" mode, the probability of each training sample is normalized over a different grid of rotations, which effectively helps smooth the predicted distribution over SO(3).

When we sample from GraspIPDF, we use a pre-generated grid under the "grid" mode for efficiency.
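Sampling from a fixed, pre-generated grid amounts to drawing grid indices from the normalized scores. The sketch below illustrates the idea only and is not the repository's sampling code: the 8-element grid of rotations about the z axis, the toy_score function standing in for the trained network, and the name sample_indices are all assumptions made for the example.

```python
import math
import random


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def toy_score(r):
    """Hypothetical stand-in for the network: strongly prefers the identity."""
    return 5.0 * (r[0][0] + r[1][1] + r[2][2])


def sample_indices(grid, score_fn, num_samples, rng):
    """Draw grid indices with probability proportional to exp(score)."""
    scores = [score_fn(r) for r in grid]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    # random.choices normalizes the weights internally.
    return rng.choices(range(len(grid)), weights=weights, k=num_samples)


# The grid is built once and reused for every sampling call.
grid = [rot_z(2 * math.pi * k / 8) for k in range(8)]
rng = random.Random(0)  # fixed seed so the run is repeatable
indices = sample_indices(grid, toy_score, 1000, rng)
sampled_rotations = [grid[i] for i in indices]
print(f"fraction of samples at the identity: {indices.count(0) / len(indices):.2f}")
```

Since the grid never changes, the only per-object cost is evaluating the scores once and then drawing from the resulting categorical distribution.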

I hope I have made my points clear. If not, feel free to follow up in this issue!

waqc commented 9 months ago

Thanks a lot for the detailed explanation. I fully understand it now.