YanjieZe / 3D-Diffusion-Policy

[RSS 2024] 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
https://3d-diffusion-policy.github.io
MIT License

Point Cloud Processing #50

Closed: chereddy closed this issue 3 months ago

chereddy commented 3 months ago

Hi,

I had a couple of questions about the way you process the point cloud.

  1. It seems that the code that computes the camera position is incorrect:

    cam_body_id = self.sim.model.cam_bodyid[cam_i]
    cam_pos = self.sim.model.body_pos[cam_body_id]

    I found that self.sim.model.cam_bodyid = [0, 0, 0, 0, 23, 23], and cam_pos = [0, 0, 0] for body id of 0.

    I believe the correct way to get the cam positions is

    cam_id = self.sim.model.camera_name2id(self.cam_names[cam_i])
    cam_pos = self.sim.model.cam_pos[cam_id]

    Was this intentional? It doesn't really matter as this work only uses one camera, but it would be useful to be able to extend it to multiple cameras.

  2. How did you determine the point cloud transformation and also the scene bounds for cropping?

YanjieZe commented 3 months ago

Hi, thank you for your interest!

I think you are correct. I had also noticed that the camera parameters were not quite right, but since we use only one camera, it is not a problem. If your fix works, it would be great for extending to multiple cameras! If you succeed in fusing multiple cameras, it would be good to share some updates :)
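For readers extending this to multiple cameras, here is a minimal NumPy sketch of why the two lookups differ. In MuJoCo, `model.cam_pos` is expressed relative to the camera's parent body, so it already equals the world position for cameras attached to the world body (id 0), while a camera on a moving body (such as id 23 above) also needs that body's world pose. The helper name `camera_world_position` and all numeric values are hypothetical and not taken from this codebase. MuJoCo also reports the world-frame camera position directly as `data.cam_xpos`, which may be simpler in practice.

```python
import numpy as np


def camera_world_position(body_xpos, body_xmat, cam_pos_local):
    """Compose a body's world pose with a camera's body-local offset.

    body_xpos:     (3,) world position of the parent body
    body_xmat:     (9,) or (3, 3) world rotation of the parent body (row-major)
    cam_pos_local: (3,) camera position relative to the parent body
    """
    body_xpos = np.asarray(body_xpos, dtype=float)
    body_xmat = np.asarray(body_xmat, dtype=float).reshape(3, 3)
    cam_pos_local = np.asarray(cam_pos_local, dtype=float)
    return body_xpos + body_xmat @ cam_pos_local


# Hypothetical camera on the world body: the local offset is the world position.
world_cam = camera_world_position([0.0, 0.0, 0.0], np.eye(3), [0.6, 0.3, 1.1])

# Hypothetical camera on a body that is translated and rotated 90 degrees about z.
rot_z_90 = np.array([[0.0, -1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0]])
body_cam = camera_world_position([0.2, 0.0, 0.5], rot_z_90, [0.1, 0.0, 0.05])
```

Reading `body_pos` for body id 0 always gives the origin, which matches the `[0, 0, 0]` reported above, so it cannot distinguish between several cameras attached to the world body.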

For your second question, I think both were just determined by visualization. You can use our point cloud visualizer (also included in this codebase) for debugging.
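As a rough illustration of that tuning loop, the sketch below applies a 4x4 homogeneous transform to a point cloud and crops it to an axis-aligned box. The function name `transform_and_crop`, the transform, and the bounds are all placeholder assumptions, not the values used in this repository; in practice you would adjust them while inspecting the result in a visualizer.

```python
import numpy as np


def transform_and_crop(points, transform, lower, upper):
    """Transform the xyz columns of an (N, 3+) point cloud, then crop to a box.

    points:    (N, 3) or (N, 3 + C) array; extra columns (e.g. color) are kept
    transform: (4, 4) homogeneous transform applied to xyz
    lower:     (3,) minimum x, y, z of the crop box, in the transformed frame
    upper:     (3,) maximum x, y, z of the crop box, in the transformed frame
    """
    points = np.asarray(points, dtype=float)
    transform = np.asarray(transform, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    # Append a homogeneous coordinate and apply the transform to every point.
    ones = np.ones((points.shape[0], 1))
    xyz = (np.hstack([points[:, :3], ones]) @ transform.T)[:, :3]

    # Keep only the points that fall inside the box on all three axes.
    mask = np.all((xyz >= lower) & (xyz <= upper), axis=1)
    return np.hstack([xyz[mask], points[mask, 3:]])


# Placeholder transform: shift the cloud up by 1 along z.
shift_z = np.eye(4)
shift_z[2, 3] = 1.0

cloud = np.array([[0.0, 0.0, 0.0],
                  [5.0, 0.0, 0.0],    # falls outside the box in x
                  [0.5, -0.5, 0.5]])
cropped = transform_and_crop(cloud, shift_z, [-1.0, -1.0, 0.9], [1.0, 1.0, 2.0])
```

Cropping after the transform keeps the bounds in a single, fixed frame, which is convenient if points from several cameras are later merged.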