yikaiw / CEN

[TPAMI 2023, NeurIPS 2020] Code release for "Deep Multimodal Fusion by Channel Exchanging"
MIT License

Formatting iOS Lidar Depth Data For Transfer-Learning #11

Closed: elimtailor closed this issue 2 years ago

elimtailor commented 2 years ago

We have 2D depth data corresponding to an RGB image. Its values range from 0.0 to 5.0 and represent the straight-line distance, in meters, from the sensor to the object.

We want to transfer-learn (from your pretrained weights) on our dataset, and it seems we should save our depth data as PNGs, since that's the format used in the dataset you link to. What preprocessing should we run, if any, to get our depth data into the correct format? Perhaps just scale 0.0-5.0 to 0-255 and save as a grayscale PNG?

(By the way, in our data 0.0 is the default value when something is too far away or no signal is returned. This seems to be the case for the Kinect depth data as well, since the black patches are probably 0 values.)

I don't think the paper mentions any preprocessing done on depth, but utils/datasets.py does have this:

```python
if key == 'depth':
    img = cv2.applyColorMap(cv2.convertScaleAbs(255 - img, alpha=1), cv2.COLORMAP_JET)
```

What is this doing?

Thanks in advance! Eli

yikaiw commented 2 years ago

Hi, in our experiments we found the model to be robust to the normalization of depths. You can simply rescale depths to 0.0-1.0. Since your depth values are 0.0-5.0, you can change NORMALISE_PARAMS (config.py, Line 18) from `1./5000` to `1./5`.

If something is too far away or no signal is returned, then the corresponding depth should be manually set to the largest value (in your case, 5.0, before normalization) instead of zero.
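For illustration, here is a minimal preprocessing sketch. The helper name and the choice of 16-bit millimeter PNGs are assumptions on my part, not part of this repo; storing millimeters would let you keep the default `1./5000` factor, while feeding meters directly would use `1./5` as above:

```python
import cv2
import numpy as np

def save_depth_png(depth_m, out_path, max_depth_m=5.0):
    """Hypothetical helper: write a float depth map (meters) as a 16-bit PNG."""
    depth = depth_m.astype(np.float32).copy()
    # Missing returns (0.0) are treated as "too far", per the advice above.
    depth[depth == 0.0] = max_depth_m
    # Store millimeters so that reading back and multiplying by 1./5000
    # maps 0.0-5.0 m onto 0.0-1.0 (assumed convention, not from the repo).
    depth_mm = np.clip(depth * 1000.0, 0, 65535).astype(np.uint16)
    cv2.imwrite(out_path, depth_mm)  # PNG supports 16-bit grayscale
```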

"img = cv2.applyColorMap(cv2.convertScaleAbs(255 - img, alpha=1), cv2.COLORMAP_JET)" This line is actually not adopted, since the function "readimage" (utils/datasets.py Line 85) is abandoned, and is substituted by "read_image" (utils/datasets.py Line 92).

elimtailor commented 2 years ago

Thank you!