naver / kapture

kapture is a file format as well as a set of tools for manipulating datasets, and in particular Visual Localization and Structure from Motion data.
BSD 3-Clause "New" or "Revised" License

How did you generate the depth maps in GangnamStation? #47

Closed: shuimushan closed this issue 1 year ago

shuimushan commented 1 year ago

Hi, I wonder how the depth maps in the GangnamStation and HyundaiDepartmentStore datasets were generated. In the trajectories-txt, you mention that images and depth maps come from a single sensor. So were the images and depth maps collected with RGBD sensors? If not, how were the intrinsic parameters for the depth sensors in sensors.txt obtained?

mhumenbe commented 1 year ago

Hello!

The depth maps were generated from the lidar scans. We accumulated lidar scans in a small window around the image timestamps and then reprojected the 3D points onto the images using the camera parameters.
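As an illustration of the reprojection step described above, here is a minimal sketch in plain Python. It is not the code used to build the datasets: the function names, the time-window helper, and the input layout are hypothetical, the points are assumed to be already transformed into the camera frame, lens distortion is ignored, and keeping the nearest point per pixel is an assumption about how overlapping points are resolved.

```python
import math


def accumulate_scans(scans, image_timestamp, half_window):
    """Collect points from all scans whose timestamp lies within
    +/- half_window of the image timestamp.

    `scans` is a hypothetical list of (timestamp, points) pairs, where
    points are (x, y, z) tuples already expressed in the camera frame.
    """
    accumulated = []
    for timestamp, points in scans:
        if abs(timestamp - image_timestamp) <= half_window:
            accumulated.extend(points)
    return accumulated


def render_depth_map(points_cam, fx, fy, cx, cy, width, height):
    """Project 3D points onto the image plane with a pinhole model
    (no distortion) and keep the nearest depth per pixel.

    Pixels that receive no point keep the value 0.0.
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points_cam:
        if z <= 0.0:
            continue  # point is behind the camera
        # Round to the nearest pixel, assuming integer coordinates
        # denote pixel centers.
        u = math.floor(fx * x / z + cx + 0.5)
        v = math.floor(fy * y / z + cy + 0.5)
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth


# Toy usage: two scans close to the image timestamp, one far away.
scans = [
    (10.00, [(0.0, 0.0, 2.0)]),
    (10.02, [(0.0, 0.0, 5.0), (1.0, 0.0, -1.0)]),
    (12.00, [(0.0, 0.0, 1.0)]),
]
points = accumulate_scans(scans, image_timestamp=10.01, half_window=0.05)
depth_map = render_depth_map(points, fx=100.0, fy=100.0, cx=2.0, cy=1.0,
                             width=4, height=3)
```

Because the lidar is sparse compared to the image grid, a depth map produced this way has many empty pixels, which is one reason to render it at a lower resolution than the camera image.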

Images and depth maps do not need to come from a single sensor. Could you please point me to the part of the documentation that led to the confusion? The sensor_type can be, among others, camera or depth, and intrinsic parameters can be specified for both.

Does this answer your questions?

shuimushan commented 1 year ago

I see. So the depth maps are obtained by post-processing the lidar scans. Then how were the intrinsic params for each depth sensor acquired?

mhumenbe commented 1 year ago

We projected the lidar points onto the camera images but used a lower resolution. Thus, the depth map intrinsics are a downscaled version of the corresponding camera intrinsics.

For example: GangnamStation/B1/release/mapping/sensors

This depth sensor,

40027089_01_depth, , depth, OPENCV, 864, 682, 570.5555443175579, 577.1324649757152, 433.9799746546102, 340.2369257807652, -0.157407343123675, 0.08784706389411968, 0.0008130076548242082, 0.0007199416321939081

is a downscaled version (about 1/3 of the resolution) of this camera:

40027089_01, , camera, OPENCV, 2592, 2048, 1711.6666329526736, 1733.0898654989219, 1301.9399239638306, 1021.7085395879869, -0.157407343123675, 0.08784706389411968, 0.0008130076548242082, 0.0007199416321939081

The sensor_id shows that they correspond to each other. Distortion coefficients do not change as they are independent of image resolution.
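To make the relationship between the two entries concrete, the sketch below rescales the camera parameters quoted above to the depth map resolution. The helper name `scale_intrinsics` is hypothetical and not part of kapture. It assumes the OPENCV parameter order w, h, fx, fy, cx, cy followed by the distortion coefficients, and it scales each axis by its own ratio: 864/2592 is exactly 1/3, while 682/2048 is only approximately 1/3 because 2048 is not divisible by 3.

```python
import math


def scale_intrinsics(params, new_width, new_height):
    """Rescale OPENCV-style camera parameters to a new resolution.

    `params` is [w, h, fx, fy, cx, cy, *distortion]. Focal lengths and
    the principal point are multiplied by the per-axis scale factor;
    distortion coefficients are copied unchanged.
    """
    w, h, fx, fy, cx, cy, *distortion = params
    sx = new_width / w
    sy = new_height / h
    return [new_width, new_height, fx * sx, fy * sy, cx * sx, cy * sy,
            *distortion]


# Values copied from the 40027089_01 camera entry quoted above.
camera = [2592, 2048,
          1711.6666329526736, 1733.0898654989219,
          1301.9399239638306, 1021.7085395879869,
          -0.157407343123675, 0.08784706389411968,
          0.0008130076548242082, 0.0007199416321939081]

# Values copied from the 40027089_01_depth entry quoted above.
quoted_depth = [864, 682,
                570.5555443175579, 577.1324649757152,
                433.9799746546102, 340.2369257807652,
                -0.157407343123675, 0.08784706389411968,
                0.0008130076548242082, 0.0007199416321939081]

scaled = scale_intrinsics(camera, 864, 682)

# Compare with a tolerance to allow for floating-point rounding.
matches = all(math.isclose(a, b, rel_tol=1e-9)
              for a, b in zip(scaled, quoted_depth))
print(matches)
```

With these inputs the rescaled parameters agree with the quoted depth sensor entry up to floating-point rounding, including fy and cy, which follow the vertical ratio 682/2048 rather than an exact 1/3.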

shuimushan commented 1 year ago

Thanks for your reply!