Deng-King opened 1 month ago
I got a good result by running `test_vit.sh`. I finally figured out that the second row of the visualization output is the predicted depth map and the third row is the GT; the GT was probably captured by LiDAR, which is why it looks relatively sparse.

But now I cannot seem to get a correct depth map when using known camera intrinsics (i.e. running `test_kitti.sh` and `test_nyu.sh`). I added the test code below line 263 in `do_test.py`:
```python
...
pred_depths, outputs = get_prediction(
    model=model,
    input=torch.stack(rgb_inputs),  # stack inputs for batch processing
    cam_model=None,
    pad_info=pads,
    scale_info=None,
    gt_depth=None,
    normalize_scale=None,
)
print(' -- pred --')  # line 263
print(pred_depths.shape)
print(pred_depths.max())
print(pred_depths.mean())
print(pred_depths.max())  # note: this repeats max() (min() was probably intended)
for j, gt_depth in enumerate(gt_depths):
    normal_out = None
...
```
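For context, my understanding is that Metric3D predicts depth in a canonical camera space and recovers metric depth by rescaling with the real camera's focal length, which is why passing no intrinsics (`cam_model=None`, `scale_info=None`) can leave the output at an implausible scale. A minimal sketch of that rescaling, assuming a canonical focal length of 1000.0 (the helper name and the constant are my assumptions, not the repo's API — check `transform_test_data_scalecano` and the model config for the actual values):

```python
import numpy as np

CANONICAL_FOCAL = 1000.0  # assumed canonical focal length; verify against the model config

def rescale_to_metric(canonical_depth: np.ndarray, fx: float) -> np.ndarray:
    """Rescale a canonical-space depth prediction to metric depth
    using the real camera's horizontal focal length fx (in pixels)."""
    return canonical_depth * fx / CANONICAL_FOCAL
```

For example, with NYU-like intrinsics (fx around 519) a canonical prediction of 24 m would come down to roughly 12.5 m, which is already much closer to indoor ranges.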
and the cmd output is:
```
[09/26 16:25:10 root]: Distributed training: False
[09/26 16:25:15 root]: Loading weight '/media/deng/Data/Metric3D/weight/metric_depth_vit_large_800k.pth'
[09/26 16:25:15 root]: Loading weight '/media/deng/Data/Metric3D/weight/metric_depth_vit_large_800k.pth'
[09/26 16:25:16 root]: Successfully loaded weight: '/media/deng/Data/Metric3D/weight/metric_depth_vit_large_800k.pth'
[09/26 16:25:16 root]: Successfully loaded weight: '/media/deng/Data/Metric3D/weight/metric_depth_vit_large_800k.pth'
  0%|          | 0/3 [00:00<?, ?it/s]data/nyu_demo/rgb/rgb_00000.jpg
 -- pred --
torch.Size([1, 1, 480, 1216])
tensor(24.2716, device='cuda:0')
tensor(24.2192, device='cuda:0')
tensor(24.2716, device='cuda:0')
/media/deng/Data/Metric3D/mono/utils/do_test.py:322: RankWarning: Polyfit may be poorly conditioned
  pred_global, _ = align_scale_shift(pred_depth, gt_depth)
 33%|███▎      | 1/3 [00:01<00:02, 1.31s/it]data/nyu_demo/rgb/rgb_00050.jpg
 -- pred --
torch.Size([1, 1, 480, 1216])
tensor(24.2716, device='cuda:0')
tensor(24.2192, device='cuda:0')
tensor(24.2716, device='cuda:0')
/media/deng/Data/Metric3D/mono/utils/do_test.py:322: RankWarning: Polyfit may be poorly conditioned
  pred_global, _ = align_scale_shift(pred_depth, gt_depth)
 67%|██████▋   | 2/3 [00:01<00:00, 1.62it/s]data/nyu_demo/rgb/rgb_00100.jpg
 -- pred --
torch.Size([1, 1, 480, 1216])
tensor(24.2716, device='cuda:0')
tensor(24.2196, device='cuda:0')
tensor(24.2716, device='cuda:0')
/media/deng/Data/Metric3D/mono/utils/do_test.py:322: RankWarning: Polyfit may be poorly conditioned
  pred_global, _ = align_scale_shift(pred_depth, gt_depth)
100%|██████████| 3/3 [00:01<00:00, 1.90it/s]
missing gt_depth, only save visualizations...
```
which means the model gives essentially the same (wrong) prediction for every image, but I don't know why 😭
The saved visualizations are identical as well.
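When every frame produces near-identical statistics like this, one cheap thing to rule out is that the same tensor is being fed in each iteration (e.g. the loader or the stacking step reusing one image). A small check I would add before `get_prediction` (the helper name is mine; it works on anything with `mean`/`std`/`min`/`max`, so numpy arrays and torch tensors both):

```python
def array_fingerprint(a):
    """Cheap per-image statistics used to confirm that the
    stacked inputs really differ from image to image."""
    return (float(a.mean()), float(a.std()), float(a.min()), float(a.max()))
```

Printing `array_fingerprint(x)` for each element of `rgb_inputs` should give three different tuples; if they match, the problem is in the data pipeline rather than the model.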
Hi there, thank you for your contributions.
When I ran the demo code following the `readme.md` tutorial (three images in the `./data/kitti_demo/` folder of this repository), I only got a relatively sparse depth prediction. How do I get dense depth estimates like the ones shown on the project page?

What I just got:
What I'm expecting:
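(As noted above in the thread, the sparse-looking row in the visualization is the LiDAR GT; the model prediction itself is dense.) If you want to render the dense prediction on its own, a minimal numpy-only sketch for converting a depth map to an 8-bit image (the function name is mine, not from the repo):

```python
import numpy as np

def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Normalize a dense depth map to [0, 255] for visualization.
    Invalid (zero or negative) pixels are mapped to 0."""
    valid = depth > 0
    out = np.zeros(depth.shape, dtype=np.uint8)
    if valid.any():
        d = depth[valid]
        lo, hi = d.min(), d.max()
        scale = 255.0 / (hi - lo) if hi > lo else 0.0
        out[valid] = ((d - lo) * scale).astype(np.uint8)
    return out
```

The resulting array can be saved with any image library, or passed through a colormap for a nicer rendering.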