zhyever / PatchRefiner

[ECCV 2024] Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation
MIT License

Weird results using the CityScape checkpoint + ZoeDepth #6

Open haiphamcse opened 1 week ago

haiphamcse commented 1 week ago

Hi there, loved your work! I wanted to ask about your code (especially the Cityscapes fine-tuned version) and how to reproduce the qualitative results in the paper (Fig. 6). I downloaded the checkpoint and ran the following command:

python ./tools/test.py configs/patchrefiner_zoedepth/pr_cs.py --ckp-path ../data/patchrefiner/work_dir/zoedepth/cs/pr/checkpoint_05.pth --cai-mode r32 --cfg-option general_dataloader.dataset.rgb_image_dir='./examples/' --save --work-dir ./work_dir/predictions_ --test-type general --image-raw-shape 1080 1920 --patch-split-num 2 2

In the pr_cs.py file I set pretrain_coarse_model='../data/patchrefiner/work_dir/zoedepth/cs/coarse_pretrain/checkpoint_05.pth'. However, at inference time the level of detail is substantially different from what the paper shows. As a sanity check, I also ran the pr_u4k.py config, and that output looks good. So I wonder what the problem is with the CS checkpoint? Thank you for your help.


zhyever commented 2 days ago

Hi, thanks for your interest in our work. The Cityscapes model was trained specifically on the Cityscapes dataset (outdoor, driving scenes). Expect a domain gap when applying it to images from other domains, such as the indoor ones you are using.

Another issue is the rendered depth map itself. The default rendering pipeline first normalizes the depth map and then renders it, so the metric scale that we highlight in our paper is eliminated in the process. For example, on a Cityscapes image the cs model can correctly give you a predicted depth in the range of 0-250m, whereas the prediction of the u4k model would be merely 0-10m. However, it is hard to directly visualize the difference between the two depth maps with the current method: color_pred = colorize(result, cmap='Spectral', vminp=0, vmaxp=100). Please change this line of code to: color_pred = colorize(result, cmap='Spectral', vmin=0, vmax=250).
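To illustrate why the per-image normalization hides the scale, here is a minimal sketch in plain Python. It is not the repository's colorize function: normalize_depth is a hypothetical stand-in for the normalization step only, and it assumes that vminp=0, vmaxp=100 amounts to scaling by the per-image minimum and maximum. The depth values are made up for the example.

```python
import math

def normalize_depth(depth, vmin=None, vmax=None):
    """Map depth values to [0, 1] before a colormap is applied.

    Hypothetical stand-in, not the PatchRefiner implementation.
    With vmin/vmax left as None, the per-image min and max are used
    (assumed here to match percentiles 0 and 100).
    """
    lo = min(depth) if vmin is None else vmin
    hi = max(depth) if vmax is None else vmax
    if hi <= lo:
        # Degenerate range: nothing to spread across the colormap.
        return [0.0 for _ in depth]
    return [min(max((d - lo) / (hi - lo), 0.0), 1.0) for d in depth]

# Two made-up predictions with the same structure but different metric scale.
base = [0.0, 0.1, 0.25, 0.5, 0.8, 1.0]
far = [250.0 * b for b in base]   # e.g. an outdoor-scale prediction (0-250 m)
near = [10.0 * b for b in base]   # e.g. an indoor-scale prediction (0-10 m)

# Per-image normalization: both maps collapse to the same [0, 1] values.
per_image_far = normalize_depth(far)
per_image_near = normalize_depth(near)
same_after_per_image = all(
    math.isclose(a, b, abs_tol=1e-12)
    for a, b in zip(per_image_far, per_image_near)
)

# Fixed metric range: the 0-10 m map stays in the lowest 4% of the colormap.
fixed_far = normalize_depth(far, vmin=0, vmax=250)
fixed_near = normalize_depth(near, vmin=0, vmax=250)

print("identical under per-image normalization:", same_after_per_image)
print("max under fixed range (far, near):", max(fixed_far), max(fixed_near))
```

With a fixed range, two predictions rendered side by side share one color scale, so a difference in metric depth shows up as a difference in color rather than being normalized away.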