alexklwong / mondi-python

PyTorch Implementation of Monitored Distillation for Positive Congruent Depth Completion (ECCV 2022)

Preprocessing depth input #3

Open MohammadJohari opened 1 year ago

MohammadJohari commented 1 year ago

Hi @alexklwong,

I want to test your pretrained model (VOID) on a different dataset (ScanNet). However, I do not get satisfying results. I understand that RGB images should be scaled between 0 and 1. But I am not sure if depth maps require any preprocessing. Should I normalize them? Inverse them? My current depth data is the actual depth in meters.

alexklwong commented 1 year ago

Hi, in short, it depends on the format of your depth maps.

Typically, depth maps are stored as 16-bit PNGs, as in the VOID and KITTI datasets. To preserve significant digits, depth is multiplied by a factor before saving so that quantization loss is minimal (on the order of thousandths of a meter). Loading them requires dividing by the same factor. For VOID and KITTI the factor is 256; for SceneNet and Virtual KITTI it is 100. I am not sure of the factor for ScanNet, but when you load with data_utils.load_depth(...) the multiplier defaults to 256 and should be set accordingly.
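To make the convention concrete, here is a minimal sketch of the scale conversion described above. The function names are illustrative (the repo's own file reading lives in data_utils.load_depth); this assumes depth is stored as a uint16 array, as in the VOID/KITTI PNG format:

```python
import numpy as np

def depth_png_to_meters(depth_png, multiplier=256.0):
    # depth_png: uint16 array as read from a depth PNG.
    # multiplier: 256 for VOID/KITTI, 100 for SceneNet/Virtual KITTI.
    # Dividing by the dataset's multiplier recovers metric depth in
    # meters; with 256, quantization error is at most 1/256 ~ 0.004 m.
    return depth_png.astype(np.float32) / multiplier

def meters_to_depth_png(z, multiplier=256.0):
    # Inverse: quantize metric depth for saving as a 16-bit PNG.
    return np.round(z * multiplier).astype(np.uint16)
```

So if ScanNet stores depth in millimeters (a common convention for RGB-D sensors, but worth verifying), the effective multiplier would be 1000, not the default 256.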

In the case where ScanNet differs significantly in scene content (possible: it looks like it is mainly indoor rooms, whereas VOID spans both outdoor and indoor scenes), then it is a domain gap issue.

edhyah commented 8 months ago

@MohammadJohari did you end up getting this to work for ScanNet? I'm trying to use it on a custom dataset myself, mostly indoors.