aim-uofa / AdelaiDepth

This repo contains the projects 'Virtual Normal', 'DiverseDepth', and '3D Scene Shape'. They address monocular depth estimation and 3D scene reconstruction from a single image.
Creative Commons Zero v1.0 Universal

Preprocessing Depth Data and Auxiliary Branch Architecture #37

PJ-cs closed 2 years ago

PJ-cs commented 2 years ago

Hello, thank you for your amazing work!

I am trying to reproduce your training procedure, and I have two questions regarding the preprocessing and the architecture of the lightweight branch.

If I understand correctly, you trained your network with two branches: a main branch for relative depth estimation and an auxiliary branch for disparity estimation, each with different losses depending on the dataset used.

  1. Preprocessing:

    • Did you apply any preprocessing to the relative depth sources taskonomy and 3dkenburns?
    • How did you normalize the disparity?
    • Did you specifically handle sky regions by using segmentation masks?
  2. Auxiliary Branch:

    • It seems to be a central part of the training procedure, but you only give general information in the appendix.
    • What architecture did you use for this branch? Does it only share the weights of the first 4 layers of the decoder and add a new last layer for the disparity estimation?

Thank you for your time and Best Regards

guangkaixu commented 2 years ago

@PJ-cs Hi, the relative depth and disparity are both normalized to the range 0-10. Some extremely large values in taskonomy and 3dkenburns are masked out before normalization. You can generate sky masks with the help of existing segmentation methods.

The training code of LeReS was released today. Please check the code for more details.
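
For readers following along, the preprocessing described in this reply could be sketched roughly as below. This is not the released LeReS code. The function name `normalize_depth`, the percentile cutoff `clip_percentile`, the min-max mapping, the optional `exclude_mask` (e.g. a sky mask from a segmentation model), and the `-1` fill value for masked pixels are all assumptions for illustration; the thread only states that extreme values are masked and that the target range is 0-10.

```python
import numpy as np

def normalize_depth(depth, max_out=10.0, clip_percentile=98.0, exclude_mask=None):
    """Mask invalid/extreme values, then scale the remainder to [0, max_out].

    clip_percentile and the min-max scaling are hypothetical choices;
    the thread does not say which threshold or mapping was used.
    Returns (normalized, valid_mask); masked pixels are set to -1.
    """
    depth = np.asarray(depth, dtype=np.float32)
    # Start from pixels that are finite and strictly positive.
    valid = np.isfinite(depth) & (depth > 0)
    if exclude_mask is not None:
        # Drop regions flagged by the caller (for example, sky pixels).
        valid &= ~np.asarray(exclude_mask, dtype=bool)
    out = np.full_like(depth, -1.0)
    if not valid.any():
        return out, valid
    # Mask extremely large values before computing the normalization range.
    upper = np.percentile(depth[valid], clip_percentile)
    valid &= depth <= upper
    lo, hi = depth[valid].min(), depth[valid].max()
    scale = max(float(hi - lo), 1e-8)  # avoid division by zero on flat maps
    out[valid] = (depth[valid] - lo) / scale * max_out
    return out, valid
```

Called as `out, valid = normalize_depth(depth_map, exclude_mask=sky_mask)`, the returned `valid` mask can then be used to restrict the loss to unmasked pixels.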