aim-uofa / AdelaiDepth

This repo contains the projects 'Virtual Normal', 'DiverseDepth', and '3D Scene Shape'. They address monocular depth estimation and 3D scene reconstruction from a single image.
Creative Commons Zero v1.0 Universal

Questions about the training resolution and auxiliary network. #77

Open whyygug opened 11 months ago

whyygug commented 11 months ago

Your work is really excellent, but I am having some issues with your code. Could you give me some guidance? Thank you very much!

  1. If I want to train at a lower resolution (e.g., 256x256), which arguments should I change? Is it enough to change '__C.DATASET.CROP_SIZE = (448, 448)' to '__C.DATASET.CROP_SIZE = (256, 256)'? Also, should the settings of the normal loss function change with the resolution (e.g., should the sampling distance or the number of sampled points be lowered accordingly)? If so, could you give me some advice? Many thanks!

  2. You use an auxiliary training network to output disparity. Does the auxiliary structure significantly improve performance, or does it only provide a slight improvement? Is it necessary for high-quality depth prediction? I'm trying to replace your main model structure with monodepth2's ResNet18-based network, which already outputs disparity, so a separate auxiliary disparity prediction network would be redundant during training. I'm concerned that discarding the auxiliary structure will significantly hurt performance.
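
To make question 1 concrete, here is a rough sketch of the kind of proportional scaling I have in mind. It is only an illustration under my own assumptions: the function name, the parameter names, and the example values (a minimum distance of 7.0 and 1000 sampled points) are hypothetical and not taken from the repo, and it assumes the sampling distance is measured in pixels, which may not match how the normal loss actually defines it.

```python
def scale_sampling_settings(old_crop, new_crop, old_min_dist, old_num_points):
    """Scale hypothetical sampling settings when the crop size changes.

    Assumption (not from the repo): the minimum sampling distance is in
    pixels, so it scales with the linear resolution ratio, while the number
    of sampled points scales with the pixel-count (area) ratio.
    """
    old_h, old_w = old_crop
    new_h, new_w = new_crop

    # Linear ratio: use the smaller axis ratio so the distance never
    # exceeds what the shrunken crop can support.
    linear_ratio = min(new_h / old_h, new_w / old_w)

    # Area ratio: how many pixels remain relative to the original crop.
    area_ratio = (new_h * new_w) / (old_h * old_w)

    new_min_dist = old_min_dist * linear_ratio
    new_num_points = max(1, round(old_num_points * area_ratio))
    return new_min_dist, new_num_points


if __name__ == "__main__":
    # Hypothetical example values, going from (448, 448) to (256, 256).
    dist, points = scale_sampling_settings((448, 448), (256, 256), 7.0, 1000)
    print(dist, points)
```

Is scaling like this reasonable, or should the loss settings stay fixed regardless of the crop size?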