PJLab-ADG / neuralsim

neuralsim: 3D surface reconstruction and simulation based on 3D neural rendering.
MIT License

What do `joint_camlidar` and `joint_camlidar_equivalent_extr` do? #22

Closed andrearama closed 10 months ago

andrearama commented 11 months ago

Hello, thanks for the great project and for releasing the code! I have a question about two flags (for the Waymo dataset, also used in StreetSurf). What do `joint_camlidar` and `joint_camlidar_equivalent_extr` do?

In waymo_dataset.py, I see the description:

```python
joint_camlidar=True,  # Joint all cameras and lidars; set this to [true] if you need to calibrate cam/lidar extr
joint_camlidar_equivalent_extr=True,  # Set to [false] if you need to calibrate cam/lidar extr
```

Does that mean that if we set `joint_camlidar` to true, it will refine the extrinsics of the cameras and the lidars? That would be great, but I'm not sure where this would happen: the only difference I see when setting `joint_camlidar` to True or False is how the new_odict is created (`transform=c2v @ dpose` versus `transform=c2w`).

Thanks for your time and availability!

Geniussh commented 10 months ago

It's not for extrinsics calibration on the fly. Instead, it's turned on to alleviate the burden of pose refinement when we do have vehicle poses available at the timestamps of the different sensor acquisitions. When `joint_camlidar_equivalent_extr` is turned on, it means we acknowledge that the c2w at the moment the image is captured differs from the c2w derived from the ego vehicle's pose, because of the non-zero ego motion during the lag between the pose timestamp and the image timestamp. Therefore, it takes the ego motion into account and uses the ego-compensated pose, i.e. `c2v @ dpose`, to construct the dataset. Otherwise, we ignore this and simply use the c2w at the time the image is captured.
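For intuition, here is a minimal sketch of the two branches described above (my own illustration with hypothetical names, not the actual neuralsim code; 4x4 homogeneous matrices assumed):

```python
import numpy as np

def frame_transform(c2w: np.ndarray, c2v: np.ndarray, dpose: np.ndarray,
                    joint_camlidar: bool) -> np.ndarray:
    """Per-frame camera transform stored in the dataset dict (illustrative only).

    c2w   : camera-to-world pose recorded at image capture time
    c2v   : camera-to-vehicle extrinsics
    dpose : relative ego motion between the pose timestamp and the image timestamp
    """
    if joint_camlidar:
        # Ego-compensated: express the camera relative to the vehicle pose node and
        # fold in the ego motion accumulated until the image was actually captured.
        return c2v @ dpose
    # Otherwise ignore the timestamp lag and use the capture-time camera-to-world pose.
    return c2w
```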

In terms of pose refinement, I believe it's enabled by default according to the paper. And the implementation is rather simple: during training, just wrap the input pose as a torch variable that requires gradient, and use the calculated loss to update it when backpropagating.
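As a rough illustration of that recipe (a hedged sketch, not the neuralsim implementation; the ray generation and rendering calls are placeholders):

```python
import torch

# Wrap the initial camera-to-world pose as a learnable parameter so the rendering loss
# can update it through backpropagation.
c2w = torch.nn.Parameter(torch.eye(4))            # in practice, initialize from the dataset pose
optimizer = torch.optim.Adam([c2w], lr=1e-4)

target = torch.eye(4)
target[0, 3] = 1.0                                # dummy target so the stand-in loss is non-trivial

for step in range(10):
    # rays = generate_rays(c2w, intrinsics)       # rays depend on c2w, so gradients flow back to it
    # loss = ((render(rays) - gt_pixels) ** 2).mean()
    loss = (c2w - target).pow(2).mean()           # stand-in loss so this snippet runs on its own
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Note that freely optimizing the raw 4x4 matrix like this does not keep the pose on SE(3), which is one motivation for the more structured parameterizations discussed below.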

I guess intrinsics/extrinsics calibration on the fly is not implemented/released yet because https://github.com/PJLab-ADG/neuralsim/blob/19b5b33113d09676bc72dca7c94b640c73d99710/app/models/scene/learnable_params.py#L140-L144

Please correct me if I'm wrong.

andrearama commented 10 months ago

Thanks for the answer @Geniussh! Could you please point me to where in the code the poses are being wrapped as PyTorch variables?

Geniussh commented 10 months ago

It's in the same learnable_params.py I linked above. You can track it down to the lowest level, which leads to nr3d_lib.models.attributes.transform.py. There you can find how the forward pass is defined for the custom TransformRT class, which is what I meant by "wrap", though that's certainly an oversimplification.

In my past projects (and also in iNeRF), I just initialized a torch variable of length 6 with requires_grad=True: 3 components for translation and 3 for rotation. I did not impose any constraint on the rotation during training and it still worked. IMHO the author defined such custom classes here mainly in order to impose an SO(3) constraint during training. I haven't read much of nr3d_lib's code, so maybe I'm wrong; take it with a grain of salt.
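For concreteness, a 6-DoF parameterization along those lines could look like this (my own sketch, not code from iNeRF or nr3d_lib); interpreting the 3 rotation numbers as an axis-angle vector and converting with Rodrigues' formula keeps the rotation on SO(3) by construction:

```python
import torch

def axis_angle_to_matrix(r: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: axis-angle vector (3,) -> 3x3 rotation matrix."""
    theta = r.norm() + 1e-8                       # small epsilon avoids 0/0 at zero rotation
    k = r / theta
    zero = torch.zeros((), dtype=r.dtype)
    K = torch.stack([                             # skew-symmetric cross-product matrix of k
        torch.stack([zero, -k[2], k[1]]),
        torch.stack([k[2], zero, -k[0]]),
        torch.stack([-k[1], k[0], zero]),
    ])
    return torch.eye(3, dtype=r.dtype) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

# 6 learnable numbers: 3 for translation, 3 for rotation (axis-angle).
pose6d = torch.nn.Parameter(torch.zeros(6))
optimizer = torch.optim.Adam([pose6d], lr=1e-3)

R = axis_angle_to_matrix(pose6d[3:])              # rotation correction, always a valid rotation
t = pose6d[:3]                                    # translation correction
# Compose (R, t) with the initial camera pose before casting rays, then backpropagate the
# rendering loss into pose6d exactly as in the earlier sketch.
```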

andrearama commented 10 months ago

Thanks for the prompt reply and availability, @Geniussh. Really appreciated! I guess I can close this issue.