Sirui-Xu / InterDiff

[ICCV 2023] Official PyTorch implementation of the paper "InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion"
https://sirui-xu.github.io/InterDiff
MIT License

Explanation of rotation_v and rotation matrix #11

Closed chrenx closed 10 months ago

chrenx commented 10 months ago

Hi Sirui,

The way you define rotation_v in the dataset is [[cos, 0, -sin], [0, 1, 0], [sin, 0, cos]], which is slightly different from the one on the wiki. I thought this might cause a counterclockwise rotation. You then define rotation as np.linalg.inv(rotation_v).astype(np.float32), which I thought would be a clockwise rotation.

My confusion arises where you calculate loss_body_rot_v_past in train_diffusion_smpl. I don't quite understand this part. Why did you choose this specific range of the predicted body_rot, (body_rot[1:self.args.past_len+1] - body_rot[:self.args.past_len])? Could you elaborate on this a bit more?

Thanks, chrenx

chrenx commented 10 months ago

I know that the paper explains this as promoting smooth interaction over time in terms of velocity, but I don't quite understand why slicing the motion sequence this way can be considered velocity regularization. Could you point me to a reference if possible?

Sirui-Xu commented 10 months ago

Hi @chrenx,

Regarding the rotation matrix in data processing (I guess this is unrelated to the loss function?): the goal is to canonicalize the human body so that it always faces the same horizontal direction in the first frame.
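To make the direction question concrete, here is a minimal NumPy sketch, not the repository's code. It builds a matrix in the layout quoted in the question, compares it with the textbook rotation about the y-axis, and checks what the matrix and its inverse do to a facing direction. The helper names `make_rotation_v` and `rot_y`, the angle `theta`, the choice of +z as the canonical facing direction, and the column-vector convention are all assumptions made for illustration.

```python
import numpy as np

def make_rotation_v(theta):
    """3x3 matrix in the layout quoted in the issue (row-major)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, -s],
                     [0.0, 1.0, 0.0],
                     [s, 0.0, c]])

def rot_y(theta):
    """Textbook right-handed rotation about the y-axis by theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

theta = 0.7  # hypothetical heading angle of the first frame, in radians
rotation_v = make_rotation_v(theta)
rotation = np.linalg.inv(rotation_v).astype(np.float32)

# The quoted layout is the transpose of the textbook matrix, i.e. rot_y(-theta),
# so its inverse is rot_y(theta): the two matrices rotate in opposite directions.
is_transpose = np.allclose(rotation_v, rot_y(theta).T)
is_inverse = np.allclose(rotation, rot_y(theta), atol=1e-6)

# Assumed column-vector convention: a body whose horizontal facing direction is
# rot_y(theta) @ [0, 0, 1] is mapped back to +z by rotation_v.
facing = rot_y(theta) @ np.array([0.0, 0.0, 1.0])
canonical_facing = rotation_v @ facing
```

Which of the two matrices ends up being applied, and whether points are multiplied as rows or columns, depends on the dataset code. The sketch only shows that the two matrices are inverses of each other and that one of them undoes the heading.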

As for the loss function, it promotes smoothness of the past interaction sequence; we also promote smoothness of the future generation with loss_body_rot_v_future. Separating past and future interactions, where the past is more straightforward to reconstruct from the input and the future must be generated, lets us fine-tune the model's performance. For example, we can put more emphasis on future generation by assigning it larger loss weights.
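The slicing in question, body_rot[1:past_len+1] - body_rot[:past_len], is a first-order finite difference along the time axis, which is a discrete approximation of velocity. The NumPy sketch below illustrates that idea under stated assumptions: the function name `velocity_losses`, the comparison against ground-truth velocities with a mean squared error, the array shapes, and the weights `w_past` and `w_future` are hypothetical and not taken from the repository.

```python
import numpy as np

def velocity_losses(pred, gt, past_len):
    """Split a frame-to-frame (velocity) MSE into past and future parts.

    pred, gt: arrays of shape (T, D), with time along axis 0.
    """
    # Past: differences between consecutive frames 0..past_len.
    v_pred_past = pred[1:past_len + 1] - pred[:past_len]
    v_gt_past = gt[1:past_len + 1] - gt[:past_len]
    # Future: differences between consecutive frames past_len..T-1.
    v_pred_future = pred[past_len + 1:] - pred[past_len:-1]
    v_gt_future = gt[past_len + 1:] - gt[past_len:-1]
    loss_past = np.mean((v_pred_past - v_gt_past) ** 2)
    loss_future = np.mean((v_pred_future - v_gt_future) ** 2)
    return loss_past, loss_future

T, D, past_len = 8, 3, 3  # hypothetical sequence length, feature size, past length
gt = np.linspace(0.0, 1.0, T)[:, None] * np.ones((1, D))  # constant-velocity motion

pred = gt + 0.5            # constant offset: positions differ, velocities match
pred_jitter = pred.copy()
pred_jitter[-1] += 0.1     # a jump in the last (future) frame only

w_past, w_future = 1.0, 2.0  # hypothetical weights emphasizing the future
loss_past, loss_future = velocity_losses(pred_jitter, gt, past_len)
total = w_past * loss_past + w_future * loss_future
```

A constant offset leaves both terms at zero because only frame-to-frame changes are penalized, while a sudden jump in a future frame raises only the future term. Keeping the two terms separate is what allows them to be weighted differently.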

Hope this answers your questions:) And feel free to let me know if there's anything else you'd like to know.

Best