Depth Information - GitHub Issues

Jael8300 commented 1 month ago

Hello!

Thank you so much for your great work. I was trying to replicate it and was wondering what depth information you used for depth40 and depth160 in your dataset. From my understanding, desdf.npy is obtained from the floorplan (map.png in your case). Are depth40 and depth160 obtained by running the ray_cast function in utils.utils.py for each RGB image, after splitting the image into 40 or 160 columns?

I would appreciate the clarification.

felix-ch commented 1 month ago

Hi,

You are correct on both points: desdf.npy is obtained from the floorplan through raycasting, and depth40 and depth160 are also obtained from the floorplan, using the provided ray_cast function. To clarify further: for each RGB image, we first get the SE2 camera pose in the floorplan. Then, depending on whether we use 40 or 160, we divide the RGB image uniformly into 40 or 160 columns and calculate the viewing angle of each column, i.e., angles from fov/2 to -fov/2. (Note: the columns are uniform in image width, not in angle; we use the midpoint of each column to calculate its angle.) Finally, by raycasting along each direction in the floorplan, we get the depth corresponding to each image column.
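For readers who want to see the steps above in code, here is a minimal sketch. It is not the repository's ray_cast implementation: the function names (column_angles, cast_ray, floorplan_depths), the pinhole camera model, the simple ray marching, and the (x, y, yaw) pose convention in meters and radians are all assumptions made for illustration. It returns the distance along each ray; the thread does not say whether the released files store that or the perpendicular depth.

```python
import numpy as np

def column_angles(fov, n_cols):
    """Viewing angle (radians) of each column midpoint, left to right."""
    # Column midpoints in normalized image coordinates: -1 is the left edge, +1 the right edge.
    mid = (np.arange(n_cols) + 0.5) / n_cols * 2.0 - 1.0
    # Pinhole model (assumption): columns are uniform on the image plane, so the
    # angles are NOT uniformly spaced. Left columns get positive angles.
    return np.arctan(-mid * np.tan(fov / 2.0))

def cast_ray(occ, x, y, theta, resolution, max_dist=10.0):
    """March along a ray in an occupancy grid (nonzero = occupied).

    Returns the distance in meters to the first occupied cell, or max_dist
    if the ray leaves the map or hits nothing within max_dist.
    """
    step = resolution / 2.0  # half a cell per step (assumption)
    dist = 0.0
    while dist < max_dist:
        col = int(np.floor((x + dist * np.cos(theta)) / resolution))
        row = int(np.floor((y + dist * np.sin(theta)) / resolution))
        if row < 0 or col < 0 or row >= occ.shape[0] or col >= occ.shape[1]:
            return max_dist
        if occ[row, col]:
            return dist
        dist += step
    return max_dist

def floorplan_depths(occ, pose, fov, n_cols, resolution, max_dist=10.0):
    """One ray length per image column for a camera at pose = (x, y, yaw)."""
    x, y, yaw = pose
    angles = column_angles(fov, n_cols)
    return np.array([cast_ray(occ, x, y, yaw + a, resolution, max_dist) for a in angles])
```

Calling floorplan_depths with n_cols=40 or n_cols=160 would give one value per column, analogous in shape to the depth40 and depth160 arrays discussed here.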

Jael8300 commented 1 month ago

Thank you so much for your time and explanation of the depth40/160 files. I've got two other questions regarding the desdf.npy files.

Firstly, how did you obtain the desdf.npy files? I generated desdf.npy for the gibson_f spencerville dataset, and the result seems to differ from the desdf.npy file in the dataset.

Secondly, why does eval_observation.py require desdf.npy data for my own dataset, when running eval_observation.py on gibson_f didn't seem to require desdf.npy files?

Thank you once again for the time and effort to reply!

felix-ch commented 1 week ago

Hi,

It is crucial to set the parameters of the raycast_desdf function correctly, especially the original_resolution and resolution parameters. If you generate from the released map.png, which is at 0.01 m resolution, they should be 0.01 and 0.1 respectively. When we generated the DESDFs, we first downsized the map by half, i.e., to 0.02 m resolution, to make the raycasting faster without losing much precision, then ran raycast_desdf with original_resolution = 0.02 and resolution = 0.1. Furthermore, we set max_dist = 10.
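The resolution arithmetic can be checked with a small sketch. The helper names (downsize_half, cells_per_desdf_cell) are hypothetical, and downsizing by keeping every second pixel is an assumption, since the exact resizing method is not stated here. The DESDF grid size below is simply the map's physical extent divided by the DESDF resolution.

```python
import numpy as np

def downsize_half(occ):
    """Keep every second pixel on each axis: 0.01 m/pixel becomes 0.02 m/pixel.

    Plain striding is an assumption; an image-resize routine would also work.
    """
    return occ[::2, ::2]

def cells_per_desdf_cell(original_resolution, resolution):
    """How many map pixels span one DESDF cell along one axis."""
    return round(resolution / original_resolution)

# Stand-in for map.png at 0.01 m/pixel covering 10 m x 8 m.
occ = np.zeros((1000, 800), dtype=np.uint8)

small = downsize_half(occ)               # now 0.02 m/pixel
ratio = cells_per_desdf_cell(0.02, 0.1)  # map pixels per DESDF cell after downsizing
desdf_hw = (small.shape[0] // ratio, small.shape[1] // ratio)

print(small.shape, ratio, desdf_hw)
```

Passing a mismatched original_resolution (for example 0.01 for a map that has already been downsized to 0.02) would change this ratio, which is one plausible way to end up with a DESDF that differs from the released one.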
The DESDF data is needed to localize the camera, since the predicted rays are compared against it. That's why you need the DESDF for evaluation, and eval_observation.py does require desdf.npy, as you can see here: https://github.com/felix-ch/f3loc/blob/f28e3307d829d06cfddcce94460710cd3600613b/eval_observation.py#L250-253

felix-ch / f3loc

Depth Information #4