KAIST-VCLAB / EgocentricReconstruction


Inconsistency in Depth #2

Closed. babar41 closed this issue 1 year ago.

babar41 commented 2 years ago

First of all, thank you very much for sharing your work.

I tried running the code on a short 360° video captured inside a small conference room with a RICOH Theta camera, and extracted the camera poses using OpenVSLAM (as suggested in your manuscript). The video resolution is 1920 x 960. However, the output 3D mesh is not correct, and I have noticed that the depth between consecutive frames is inconsistent and varies a lot. I have also run the pipeline on the demo data provided in the repo, and there the depth estimates are very consistent across frames.
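For reference, the frame-to-frame variation can be quantified with something like the sketch below (the `depth_maps/*.npy` layout is only an assumed way of dumping the per-frame depth maps, not the repository's actual output format):

```python
# Minimal sketch: quantify frame-to-frame depth consistency.
# Assumes per-frame depth maps were saved as sequentially numbered .npy
# files (a hypothetical layout); adapt the loading to the real output format.
import glob
import numpy as np

paths = sorted(glob.glob("depth_maps/*.npy"))
prev = None
for path in paths:
    depth = np.load(path).astype(np.float64)
    if prev is not None:
        valid = (depth > 0) & (prev > 0)  # ignore invalid/zero depth values
        rel = np.abs(depth[valid] - prev[valid]) / prev[valid]
        print(f"{path}: mean relative change vs. previous frame = {rel.mean():.3f}")
    prev = depth
```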

Could you please help comment on:

1) What could cause the depth estimates to be inconsistent across consecutive frames?

2) I noticed that if I decrease the sample ratio to 1 (i.e. use all the frames), the depth estimate shows more dark regions, whereas when I increase the sample ratio, the depth appears more evened out. Is this because the depth depends on the amount of disparity between consecutive frames? (Please see the two depth outputs from the same frame of the video sequence, processed with two different sample ratios.) egocentric_depth_of_same_frame

Hyeonjoong-Jang commented 2 years ago

Hi babar41,

The sample-ratio is used to subsample frames from the input video, producing a shorter sequence. Therefore, the neighbor frames used for depth estimation of the same reference frame change when the sample-ratio changes.
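Conceptually, the sample-ratio acts as plain frame subsampling, along the lines of the sketch below (an illustration of the idea only, not the repository's implementation; the OpenCV-based reader and the `sample_ratio` argument name are assumptions):

```python
# Illustration only: how a sample ratio changes which frames (and therefore
# which camera baselines) are available as neighbors for depth estimation.
import cv2

def sample_frames(video_path: str, sample_ratio: int):
    """Keep every `sample_ratio`-th frame of the input video."""
    cap = cv2.VideoCapture(video_path)
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % sample_ratio == 0:
            frames.append(frame)
        index += 1
    cap.release()
    return frames

# sample_ratio=1 keeps all frames (small baselines between neighbors);
# a larger ratio skips frames, so neighboring frames are farther apart.
frames = sample_frames("conference_room_360.mp4", sample_ratio=4)
print(f"kept {len(frames)} frames")
```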

For the first question: our depth estimation pipeline is based on RAFT. If the estimated depth shows low temporal consistency, it may be because the scene contains many reflective surfaces.

For the second question: the depth map obtained with a sample ratio of 1 shows dark regions over almost half of the image. This usually happens when the set of neighbor frames used for depth estimation lies on a straight line. It is not a problem, because the depth values at those points will be estimated from other reference frames. It should help to move and rotate the camera more, so that the camera trajectory is less close to a straight line.
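As a side note, you can check how close the camera centers of your capture are to a straight line directly from the estimated poses, e.g. with a PCA-style test like the sketch below (the trajectory loading is omitted because it depends on how you export the OpenVSLAM poses):

```python
# Rough sketch: measure how close a set of camera centers is to a straight line.
# `centers` is an (N, 3) array of camera positions; loading them from the
# exported OpenVSLAM trajectory is left out because it depends on the format.
import numpy as np

def collinearity_score(centers: np.ndarray) -> float:
    """Fraction of positional variance explained by the dominant direction.
    Values near 1.0 indicate a nearly straight-line trajectory."""
    centered = centers - centers.mean(axis=0)
    # Singular values of the centered positions give the spread of the
    # trajectory along its principal directions.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return float(variances[0] / variances.sum())

# Example with a synthetic, almost-linear trajectory:
t = np.linspace(0.0, 1.0, 20)
centers = np.stack([t, 0.01 * np.sin(6 * t), np.zeros_like(t)], axis=1)
print(f"collinearity score: {collinearity_score(centers):.3f}")  # close to 1.0
```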