Meowuu7 / GeneOH-Diffusion

[ICLR'24] GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion
https://meowuu7.github.io/GeneOH-Diffusion/
MIT License

Calculation of Contact Point #3

Closed: zhoujun-7 closed this issue 3 months ago

zhoujun-7 commented 3 months ago

Hi Xueyi!

Thanks for sharing this interesting work. The proposed method relies on contact points on the object, but the calculation of these contact points seems to be based on the ground-truth hand pose rather than the noisy one. If the ground-truth hand pose is strictly unavailable, can the proposed method still work?

Looking forward to your reply.

BR, Jun

Meowuu7 commented 3 months ago

Hello Jun!

Thank you for your careful review of our code. Regarding your observation about the calculation of contact points, I believe there might be a misunderstanding. It seems your question pertains to how we calculate the base_pts within the GRAB_Dataset_V19 class in the file dataset_ours_single_seq.py. I guess it is the following code

dist_rhand_joints_to_obj_pc = torch.sum(
    (rhand_joints.unsqueeze(2) - object_pc_th.unsqueeze(1)) ** 2, dim=-1
)

that led you to conclude that we use the GT rhand_joints to calculate the contact region. However, note that GRAB_Dataset_V19 serves as the data loader for the first stage, where only the hand trajectory is denoised. Although we do compute contact-related information within this class, we do not use it in the denoising process.
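For readers unfamiliar with the broadcasting pattern in the snippet above, here is a minimal sketch of what it computes. The tensor shapes (frames, points, 3) and the names pairwise_sq_dists, hand_pts, and object_pc are assumptions made for illustration; they are not taken from the repository.

```python
import torch


def pairwise_sq_dists(points_a: torch.Tensor, points_b: torch.Tensor) -> torch.Tensor:
    """Squared Euclidean distances between two batched point sets.

    Assumed shapes: points_a is (T, A, 3), points_b is (T, B, 3).
    Returns a (T, A, B) tensor.
    """
    # (T, A, 1, 3) - (T, 1, B, 3) broadcasts to (T, A, B, 3).
    diff = points_a.unsqueeze(2) - points_b.unsqueeze(1)
    # Summing the squared differences over the xyz axis leaves (T, A, B).
    return torch.sum(diff ** 2, dim=-1)


torch.manual_seed(0)
hand_pts = torch.randn(4, 21, 3)    # hypothetical: 4 frames, 21 hand joints
object_pc = torch.randn(4, 100, 3)  # hypothetical: 4 frames, 100 object points

sq_dists = pairwise_sq_dists(hand_pts, object_pc)
print(sq_dists.shape)  # torch.Size([4, 21, 100])
```

Entry [t, j, n] of the result is the squared distance between hand point j and object point n in frame t, so no square root is taken.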

In the subsequent stage, where the contact-related representation is denoised, the data loader switches to GRAB_Dataset_V19_From_Evaluated_Info. Here, we use the hand trajectory predicted by the first stage to compute contact information, as shown in the following code:

if self.wpredverts:
    pert_rhand_joints = self.predicted_hand_joints
    rhand_joints = self.predicted_hand_joints

    rhand_verts = self.predicted_hand_verts
    pert_rhand_verts = self.predicted_hand_verts

    pert_rhand_joints = torch.matmul(
        pert_rhand_joints, object_global_orient_mtx_th
    ) + object_trcansl_th.unsqueeze(1)

    rhand_joints = torch.matmul(
        rhand_joints, object_global_orient_mtx_th
    ) + object_trcansl_th.unsqueeze(1)

    rhand_verts = torch.matmul(
        rhand_verts, object_global_orient_mtx_th
    ) + object_trcansl_th.unsqueeze(1)

    pert_rhand_verts = torch.matmul(
        pert_rhand_verts, object_global_orient_mtx_th
    ) + object_trcansl_th.unsqueeze(1)

where we set both rhand_joints and pert_rhand_joints to the predicted_hand_joints, which is loaded from the file args.predicted_info_fn.
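The four torch.matmul calls above all apply the same per-frame rotation and translation to a set of points stored as row vectors. A minimal sketch of that pattern follows, assuming points of shape (T, P, 3), a rotation matrix of shape (T, 3, 3), and a translation of shape (T, 3). The helper name apply_rot_transl and the toy values are hypothetical, and the sketch makes no claim about which coordinate frames the repository maps between.

```python
import torch


def apply_rot_transl(points: torch.Tensor, rot_mtx: torch.Tensor,
                     transl: torch.Tensor) -> torch.Tensor:
    """Row-vector transform: points @ rot_mtx + transl, frame by frame.

    Assumed shapes: points (T, P, 3), rot_mtx (T, 3, 3), transl (T, 3).
    """
    # unsqueeze(1) turns transl into (T, 1, 3) so it broadcasts over P points.
    return torch.matmul(points, rot_mtx) + transl.unsqueeze(1)


# Toy data: one frame, one point on the x-axis.
points = torch.tensor([[[1.0, 0.0, 0.0]]])
rot_mtx = torch.tensor([[[0.0, 1.0, 0.0],
                         [-1.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0]]])
transl = torch.tensor([[0.0, 0.0, 2.0]])

out = apply_rot_transl(points, rot_mtx, transl)
print(out)  # tensor([[[0., 1., 2.]]])
```

Because the points are right-multiplied by the matrix, the result matches left-multiplying column vectors by the transpose of rot_mtx, which is worth keeping in mind when comparing against code that uses the column-vector convention.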

The reason contact-related information is computed in the GRAB_Dataset_V19 class in dataset_ours_single_seq.py, even though it is based on ground-truth joints, is that this class was adapted from the training data loader, also named GRAB_Dataset_V19, in dataset_ours.py. During training, only clean trajectories are used, with contact information derived from ground-truth hand trajectories. The adapted class in dataset_ours_single_seq.py keeps this calculation unchanged, apart from some other adjustments. To re-emphasize, however: during evaluation of the first stage, the contact information is not used.

I've modified the logic to eliminate such confusion:

dist_rhand_joints_to_obj_pc = torch.sum(
    (pert_rhand_verts.unsqueeze(2) - object_pc_th.unsqueeze(1)) ** 2, dim=-1
)
_, minn_dists_joints_obj_idx = torch.min(dist_rhand_joints_to_obj_pc, dim=-1)
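As a small illustration of the torch.min step, the sketch below runs the same distance-then-argmin pattern on toy data that can be checked by hand. All names and values are hypothetical, and the final torch.gather lookup is only an assumption about how such indices could be used to pick the nearest object points; it is not taken from the repository.

```python
import torch

# Toy data: 1 frame, 2 hand vertices, 3 object points.
hand_verts = torch.tensor([[[0.0, 0.0, 0.0],
                            [1.0, 1.0, 0.0]]])   # (1, 2, 3)
object_pc = torch.tensor([[[0.0, 0.1, 0.0],
                           [1.0, 0.9, 0.0],
                           [5.0, 5.0, 5.0]]])    # (1, 3, 3)

# Pairwise squared distances, shape (1, 2, 3).
sq_dists = torch.sum(
    (hand_verts.unsqueeze(2) - object_pc.unsqueeze(1)) ** 2, dim=-1
)

# For each hand vertex, the smallest squared distance and the index of the
# object point that attains it; both have shape (1, 2).
min_sq_dists, nearest_idx = torch.min(sq_dists, dim=-1)
print(nearest_idx)  # tensor([[0, 1]])

# Look up the coordinates of each nearest object point, shape (1, 2, 3).
gather_idx = nearest_idx.unsqueeze(-1).expand(-1, -1, 3)
nearest_pts = torch.gather(object_pc, 1, gather_idx)
```

Here the first vertex is closest to object point 0 and the second to object point 1, each at a squared distance of about 0.01.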

Again, please note that although we calculate contact information, we only denoise hand trajectories in the first stage, where the contact-related representation is neither used nor denoised. Eventually, I'll remove the contact-related calculation from the first stage's data loader, GRAB_Dataset_V19, to avoid similar confusion.

Thank you for bringing this to our attention! The codebase still needs a thorough cleanup. I'll be making further commits to make it more organized.

Best regards, Xueyi

zhoujun-7 commented 3 months ago

Thanks for your patience in answering my question. Sorry for the misunderstanding. My problem was solved.