OPEN-AIR-SUN / mars

MARS: An Instance-aware, Modular and Realistic Simulator for Autonomous Driving

Usage of multiple cameras on PandaSet dataset #146

Closed j-pens closed 5 months ago

j-pens commented 6 months ago

Hi, I'm currently trying to train MARS with multiple cameras from the PandaSet dataset, using the data parser from @pierremerriaux-leddartech. The images load and training starts running, but there seems to be an issue with the object models: they are learned as 'transparent' during training and thus disappear after a few iterations. Do you have any pointers on which part of the code could be the cause? I suspect there is an issue with transforming the object models into the frames of the different cameras, leading to reconstruction errors that can only be optimised away by learning a density of 0. Any help would be appreciated!
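As a sanity check, something like this could verify the transforms (a sketch, not MARS code; it assumes world-frame object centers, OpenCV-style camera-to-world poses, and a 3x3 intrinsics matrix `K`; `project_to_image` is a hypothetical helper):

```python
import numpy as np

def project_to_image(point_world, cam_to_world, K):
    """Project a world-frame 3D point into pixel coordinates.

    Assumes an OpenCV-style camera (x right, y down, z forward),
    a 4x4 camera-to-world pose, and 3x3 intrinsics K. Returns None
    if the point is behind the camera.
    """
    world_to_cam = np.linalg.inv(cam_to_world)
    p_cam = world_to_cam @ np.append(np.asarray(point_world), 1.0)
    if p_cam[2] <= 0:  # behind the image plane
        return None
    uv = K @ p_cam[:3]
    return uv[:2] / uv[2]
```

If the projected box centers land inside the image for the front camera but not for the other cameras, the per-camera transforms (rather than the annotations themselves) would be the likely culprit.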

Nplace-su commented 6 months ago

What object model do you use? Do you encounter the same issue when using a single camera sequence?

j-pens commented 6 months ago

I use CarNeRF as the object model. Training with just the front camera works great.

j-pens commented 6 months ago

Just to see if I understand it correctly:

From what I can see, the issue likely comes from the obj_info and the camera poses not matching, so the generated rays never intersect the object boxes.

We have the following coordinate frames:

PandaSet annotations are given in the global/LiDAR frame. This is also the frame that the object info from the data parser is based on, right? IIUC, the camera poses from PandaSet are also given in the global/LiDAR frame (see the sketch after this list).

MARS is based on the KITTI/vKITTI2 datasets and uses their coordinate frames, i.e. annotations are given in the frame of the right camera. Which frame is used for the obj_info and the camera poses in MARS/nerfstudio? Also, some flipping is done in pandaset_dataparser.py for the camera poses. Why is this necessary?
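For reference, a PandaSet pose could be turned into a camera-to-world matrix roughly like this (a sketch assuming the devkit's dict layout of a world-frame position plus a wxyz heading quaternion; `pandaset_pose_to_matrix` is a hypothetical helper, not part of the data parser):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def pandaset_pose_to_matrix(pose):
    """Build a 4x4 camera-to-world matrix from a PandaSet pose dict,
    assumed to look like {'position': {x, y, z}, 'heading': {w, x, y, z}}."""
    t = np.array([pose['position'][k] for k in ('x', 'y', 'z')])
    q = pose['heading']
    # scipy expects quaternions in xyzw order; PandaSet stores wxyz
    rot = Rotation.from_quat([q['x'], q['y'], q['z'], q['w']]).as_matrix()
    m = np.eye(4)
    m[:3, :3] = rot
    m[:3, 3] = t
    return m
```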

pierremerriaux-leddartech commented 6 months ago

@j-pens, yes, objects and cameras are in world coordinates after the MARS dataparser. The strange flipping of the cameras is because nerfstudio's camera convention has the z-axis pointing backward, away from the viewing direction.

j-pens commented 6 months ago

@pierremerriaux-leddartech thanks for the reply! From what I can see, we could just transform the camera orientation by rotating it 180 degrees around the x-axis. That would simplify the code a lot. Was there a reason for doing it the way the current version of the code does?
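Something like this is what I have in mind (a minimal sketch, assuming the incoming poses are OpenCV-style camera-to-world matrices with z pointing forward):

```python
import numpy as np

# Rotating 180 degrees around the camera x-axis flips y and z,
# converting an OpenCV-style pose (z forward, y down) into the
# nerfstudio/OpenGL convention (z backward, y up).
FLIP_YZ = np.diag([1.0, -1.0, -1.0, 1.0])

def opencv_to_nerfstudio(cam_to_world):
    return cam_to_world @ FLIP_YZ
```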

pierremerriaux-leddartech commented 6 months ago

You are right, it was old code.

JiantengChen commented 5 months ago

Hi! @j-pens, I am closing this issue for inactivity, but feel free to reopen :)