
alexklwong / calibrated-backprojection-network

PyTorch Implementation of Unsupervised Depth Completion with Calibrated Backprojection Layers (ORAL, ICCV 2021)


normalization or not #27

Closed Thermaloo closed 1 year ago

Thermaloo commented 1 year ago

Hi~ Alex,

Thank you for your nice work. I succeeded in reproducing your results. However, I have some questions about whether the image or the depth is normalized. They don't seem to be normalized in your code, so what is the reason for that? Does normalization yield better results? By the way, if the image resolution is too large for the available CUDA memory, could I resize the image and depth instead of using random crops? During testing, the predicted low-resolution depth map would then be upsampled to the original resolution. Looking forward to your reply!

Best regards, Viot

alexklwong commented 1 year ago

Hi Viot, we do normalize the image to the 0-1 range here:

https://github.com/alexklwong/calibrated-backprojection-network/blob/master/src/kbnet.py#L146
https://github.com/alexklwong/calibrated-backprojection-network/blob/master/src/kbnet.py#L415-L422

but do not normalize depth. Depth completion is special because we are given metric depth, as opposed to many other problems where depth is scaleless. If the image resolution is too large, my choice is to crop, but you are welcome to resize. Note that when you resize, you will need to handle the positions of the sparse depth points and also update your intrinsics.
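As a rough illustration of the intrinsics update mentioned above (a sketch, not code from this repository): assuming a standard 3x3 pinhole matrix and a plain resize with no cropping, the first row scales with the width ratio and the second row with the height ratio. The function name `scale_intrinsics` and the example values are hypothetical, and a half-pixel offset convention for the principal point would need a small adjustment.

```python
def scale_intrinsics(K, orig_size, new_size):
    """Return a copy of 3x3 pinhole intrinsics K rescaled for a resized image.

    K is a nested list [[fx, s, cx], [0, fy, cy], [0, 0, 1]].
    orig_size and new_size are (height, width) tuples.
    Assumption: plain resize, no cropping, no half-pixel offset handling.
    """
    orig_h, orig_w = orig_size
    new_h, new_w = new_size
    scale_x = new_w / orig_w
    scale_y = new_h / orig_h

    # Row 0 (fx, skew, cx) lives in x-pixel units, row 1 (fy, cy) in y-pixel units.
    K_new = [list(row) for row in K]
    K_new[0] = [value * scale_x for value in K[0]]
    K_new[1] = [value * scale_y for value in K[1]]
    return K_new


# Hypothetical intrinsics for a 360 x 1200 image, resized to half resolution.
K = [[720.0, 0.0, 600.0],
     [0.0, 720.0, 180.0],
     [0.0, 0.0, 1.0]]
K_half = scale_intrinsics(K, orig_size=(360, 1200), new_size=(180, 600))
```

Cropping, by contrast, leaves the focal lengths unchanged and only shifts the principal point by the crop offset.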

Thermaloo commented 1 year ago

Thank you for your reply! I'm sorry, I overlooked the normalization of the image. From your reply, I understand that an improper resize method will put the depth points at incorrect positions. Could you give me some suggestions on how to handle the positions of the sparse depth points?

Thermaloo commented 1 year ago

What's more, I recently tried to reproduce the results of self-supervised sparse-to-dense. However, training in the self-supervised mode does not seem to converge. I found that others have raised the same issue, but the author did not reply. I found this work mentioned in your awesome-state-of-depth-completion repo. Have you tried to replicate the results of its self-supervised mode? Thank you very much for your patient replies, which are very helpful to me.

alexklwong commented 1 year ago

You can directly compute the positions based on the original and resized dimensions of the image. Some have tried to use nearest-neighbor (NN) interpolation, but you would still lose points.
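One possible way to compute the positions directly is sketched below. This is only an illustration, not code from this repository: it assumes missing measurements are stored as 0, maps each pixel center to the resized grid, and, when two points land on the same target pixel, keeps the nearer one (that tie-breaking rule is an assumption). The function name `resize_sparse_depth` is hypothetical, and the loop is written in plain Python for readability rather than speed.

```python
def resize_sparse_depth(sparse_depth, new_size):
    """Move each valid (non-zero) depth sample to its pixel in the resized grid.

    sparse_depth: list of rows (H x W); 0 marks a missing measurement.
    new_size: (new_height, new_width).
    Depth values are copied unchanged because they are metric.
    """
    orig_h = len(sparse_depth)
    orig_w = len(sparse_depth[0])
    new_h, new_w = new_size
    resized = [[0.0] * new_w for _ in range(new_h)]

    for y, row in enumerate(sparse_depth):
        for x, z in enumerate(row):
            if z <= 0:
                continue  # skip missing measurements
            # Map the pixel center into the new grid and clamp to the border.
            new_y = min(int((y + 0.5) * new_h / orig_h), new_h - 1)
            new_x = min(int((x + 0.5) * new_w / orig_w), new_w - 1)
            # Assumption: on a collision, keep the point closer to the camera.
            current = resized[new_y][new_x]
            if current == 0 or z < current:
                resized[new_y][new_x] = z
    return resized


# Toy 4 x 4 sparse depth map with three measurements, downsized to 2 x 2.
sparse = [[5.0, 0.0, 0.0, 0.0],
          [0.0, 3.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 8.0, 0.0]]
small = resize_sparse_depth(sparse, (2, 2))
```

In this toy example the first two measurements fall into the same output pixel, so one of them is dropped; this is the kind of point loss that downsizing causes regardless of the method.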

We also ran into trouble when training the sparse-to-dense code repository several years ago. I recall that we needed to spend a lot of time to get it to work, but I don’t remember the details of what needed to be done.