ethz-asl / virus_nerf


Sample dataset + custom dataset description for correct depth supervision #1

Open onurbagoren opened 1 month ago

onurbagoren commented 1 month ago

Hello!

The work and results from the paper are very impressive! Are there any example datasets that you would be able to share for running experiments?

I see from the config files that there are ones named ETHZDataset or the "robot at home" datasets, and I was curious if these could be made available.

Additionally, if a custom sensor rig is being used, where can the transforms between the sensors be edited, if that's an option?

Thank you very much for all your work and help :)

ceinem commented 1 month ago

Hi! Thanks for your interest in our work. The robot@home dataset is publicly available; however, it doesn't actually include all the sensors we used in the paper. It mainly contains RGB-D cameras and was therefore only used at an early stage of development, and it is not mentioned or used in our paper. Regarding the second dataset, the ETHZDataset, we recorded it ourselves in our lab. Currently we cannot share it for privacy reasons, as it might include the faces of other people or other confidential information. I will let you know should we be able to make it available at some point in the future. Regarding the sensor rig and its calibration, @nas-git-nas can probably tell you more.

Thanks!

onurbagoren commented 1 month ago

Ah I see -- Thank you for the prompt response!

Is it possible to get an example of a rosbag that does not have images? I can see the message types and how they get synced from the scripts; I just want to see a visualization of a rosbag that is as simple as the poses + USS/IRS readings, along with the tf tree structure associated with it.

Thank you for your help :)

onurbagoren commented 1 month ago

Additionally, what camera coordinate convention is used? Is it the standard ROS coordinate frame?

nas-git-nas commented 1 month ago

Thank you for your interest!

Sample Dataset: As already mentioned, we are not allowed to publish the entire ETHZ dataset. However, I just uploaded 1 s of the ETHZ-office dataset to virus_nerf/sample_dataset. This should clarify the structure of the dataset, and it can be used to test whether the code is working properly.

Record Dataset: If you use rosbags to record your own data, you can use the provided pre-processing scripts (see the Dataset Collection section of the README). The different sensors are synchronized during pre-processing by associating the datapoints that are closest in time (see ETHZ_experiments/catkin_ws/sensors/src/data_tools/main_syncData.py).
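To illustrate the closest-in-time association, here is a minimal NumPy sketch (not the actual code in main_syncData.py; the example timestamps are made up):

```python
import numpy as np

def sync_closest_in_time(target_stamps: np.ndarray, source_stamps: np.ndarray) -> np.ndarray:
    """For every target timestamp, return the index of the closest source timestamp.

    Both arrays are assumed to be 1-D, in seconds, and sorted in ascending order.
    """
    # Position where each target stamp would be inserted into the source stamps.
    idx = np.clip(np.searchsorted(source_stamps, target_stamps), 1, len(source_stamps) - 1)
    # Keep whichever neighbour (before or after the insertion point) is closer in time.
    left, right = source_stamps[idx - 1], source_stamps[idx]
    idx -= (target_stamps - left) < (right - target_stamps)
    return idx

# Example: associate each camera frame with the closest USS reading.
camera_stamps = np.array([0.00, 0.10, 0.20, 0.30])
uss_stamps = np.array([0.02, 0.11, 0.19, 0.33])
print(sync_closest_in_time(camera_stamps, uss_stamps))  # -> [0 1 2 3]
```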

Transforms of the sensors: The intrinsic camera parameters should be saved in the CSV file camera_intrinsics.CSV in the dataset directory (see virus_nerf/sample_dataset/camera_intrinsics.CSV). The extrinsic camera parameters are given implicitly through the poses (see virus_nerf/sample_dataset/office/poses). The poses are saved in look-up tables in the reference frame of the ground-truth map. These look-up tables are created during pre-processing by the script ETHZ_experiments/catkin_ws/sensors/src/data_tools/main_createPoseLookupTables.py. The transforms can be defined in ETHZ_experiments/catkin_ws/sensors/src/pcl_tools/pcl_coordinator.py.
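For example, loading the intrinsics and a pose row could look roughly like this (a sketch only; the pose file name and the column headers such as fx/fy/cx/cy are assumptions, so please check the files in the sample dataset):

```python
import numpy as np
import pandas as pd

# Intrinsics file from the sample dataset; the column names below are assumptions.
intrinsics = pd.read_csv("virus_nerf/sample_dataset/camera_intrinsics.CSV")
fx, fy, cx, cy = intrinsics.loc[0, ["fx", "fy", "cx", "cy"]]
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])  # pinhole camera matrix

# Pose look-up table: one row per timestamp, expressed in the ground-truth map frame.
# "poses_cam1.csv" and its column names are hypothetical placeholders.
poses = pd.read_csv("virus_nerf/sample_dataset/office/poses/poses_cam1.csv")
row = poses.iloc[0]  # assumed columns: time, x, y, z, qx, qy, qz, qw
print(K, float(row["time"]))
```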

Camera Coordinate Convention: Yes, the standard ROS coordinate frame is used for the cameras. For the collection of the ETHZ dataset the cameras were mounted upside down, and therefore the images are flipped. However, if you use a different setup, you only need to define the poses correctly, i.e. the transform from the cameras/LiDAR to the ground-truth map (see virus_nerf/sample_dataset/office/poses).
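As a sketch of what the upside-down mounting means for the poses (assuming the ROS optical-frame convention, z forward, x right, y down; this is an illustration, not code from the repository):

```python
import numpy as np

# A camera mounted upside down is rotated by 180 degrees about its optical (z) axis.
R_FLIP_Z = np.array([[-1.0,  0.0, 0.0],
                     [ 0.0, -1.0, 0.0],
                     [ 0.0,  0.0, 1.0]])

def upright_camera_pose(T_map_cam_mounted: np.ndarray) -> np.ndarray:
    """Fold the 180-degree roll of an upside-down-mounted camera into its 4x4 camera-to-map pose."""
    T = T_map_cam_mounted.copy()
    T[:3, :3] = T[:3, :3] @ R_FLIP_Z  # rotate about the camera's own z-axis
    return T

# Alternatively, the recorded images can simply be rotated by 180 degrees:
# image_upright = np.rot90(image, k=2)
```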

onurbagoren commented 1 month ago

This is really helpful, thank you!

onurbagoren commented 1 month ago

I also had an additional question: for situations where the USS and the IRS are not aligned with the camera, where would these extrinsics be used? I see from the setup in the paper that the sensors are aligned with the camera and the projections are in the direction of the camera. From what I can tell, the depth from the USS and IRS is projected similarly to how the depth image is, but I don't see a good place to input these sensor extrinsics.

nas-git-nas commented 1 month ago

We project the measurements of the USS/IRS onto the colour and depth images of the camera, assuming the sensors overlap and point in the same direction. Hence, during training the same pixels/samples can be used to calculate the losses related to the colour and depth images and to the USS/IRS measurements with only one forward pass. If the USS/IRS has a relatively small orientation error with respect to the camera, you could add an orientation offset, as is done for the IRS with the parameter ToF/sensor_calibration_error. If the orientation/displacement error is larger, you need to adapt the training process such that the poses of the camera and of the USS/IRS are distinct, and the colour and depth pixels from the camera are sampled and passed through the network separately from the USS/IRS measurements. Then you can calculate the losses independently.
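As a rough illustration of such an orientation offset (a sketch only: the single-angle pitch model and the function below are assumptions, not how ToF/sensor_calibration_error is actually applied in the code):

```python
import numpy as np

def apply_orientation_offset(directions: np.ndarray, offset_deg: float) -> np.ndarray:
    """Rotate sensor ray directions (N, 3, camera frame) by a small calibration offset
    about the camera's y (pitch) axis before projecting them into the image."""
    a = np.deg2rad(offset_deg)
    R = np.array([[ np.cos(a), 0.0, np.sin(a)],
                  [ 0.0,       1.0, 0.0      ],
                  [-np.sin(a), 0.0, np.cos(a)]])
    return directions @ R.T

# Example: nudge the IRS rays by 1.5 degrees before associating them with camera pixels.
rays = np.array([[0.0, 0.0, 1.0]])
print(apply_orientation_offset(rays, 1.5))
```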