facebookresearch / CODD

CODD ("Temporally Consistent Online Depth Estimation in Dynamic Scenes"), WACV 2023.

How to get the results like the paper? #4

Open AppleAndBanana opened 1 year ago

AppleAndBanana commented 1 year ago

Interesting work! However, after setting up the code environment following the README and running inference on vkitti left/right data with kitti_depth.pth, I get a strange result named '00000.disp.pred.npz' with shape [1, 10, 375, 1242], like this: (screenshot) (screenshot) (the slice at [0, 6] of the [1, 10, 375, 1242] array)

It seems that I am getting a wrong disparity map. Could you please give me some advice?

mli0603 commented 1 year ago

Hi @AppleAndBanana

Thanks for your interest in the work. This indeed looks much worse than what I have seen. Which checkpoint did you use?

AppleAndBanana commented 1 year ago

I used 'kitti_depth.pth' to run inference.py with these input images: my_data.zip

My cmd is:

    python3 inference.py \
        configs/inference_config.py \
        checkpoints/kitti_depth.pth \
        --img-dir my_data/30-deg-left/ \
        --r-img-dir my_data/30-deg-right/ \
        --num-frames 40 --show --gpus 1

and then I get a .npz file: 00000.disp.pred.npz.zip

Finally, I use the following code to get the disparity image below:

    import numpy as np
    import cv2

    a = np.load('00000.disp.pred.npz')['disp']
    b = a[0, 6]  # or a[0, 0], a[0, 1], ...
    # Normalize to the 0-255 range for visualization
    c = (b - b.min()) / (b.max() - b.min() + 1e-6) * 255
    cv2.imwrite('disp.png', c.astype(np.uint8))

mli0603 commented 1 year ago

Thank you @AppleAndBanana

I will take a look and let you know.

mli0603 commented 1 year ago

Hi @AppleAndBanana

Looking at your data, it is very clear that the images are not rectified. Rectified input is the basic assumption that most stereo depth algorithms make. Rectification means that corresponding points in the left and right images lie on the same horizontal line (the same image row).
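As a rough numeric companion to this definition, here is a minimal sketch (not part of CODD) that estimates a global vertical offset between two grayscale images by aligning their per-row mean intensity profiles. The function name `estimate_vertical_shift` and the `max_shift` parameter are hypothetical. The heuristic only detects a constant row shift, not rotation or lens distortion, so a result of 0 does not prove that a pair is rectified.

```python
import numpy as np


def estimate_vertical_shift(left_gray, right_gray, max_shift=20):
    """Return the row shift dy such that left row y best matches right row y + dy.

    Both inputs are 2-D arrays (H x W) of the same height. Horizontal disparity
    barely changes a row's mean intensity, so for a rectified pair the best
    shift is expected to be close to 0.
    """
    lp = left_gray.astype(np.float64).mean(axis=1)
    rp = right_gray.astype(np.float64).mean(axis=1)
    if lp.shape != rp.shape:
        raise ValueError("images must have the same height")
    lp = lp - lp.mean()
    rp = rp - rp.mean()
    h = lp.shape[0]
    max_shift = min(max_shift, h - 1)

    best_dy, best_score = 0, -np.inf
    for dy in range(-max_shift, max_shift + 1):
        # Overlapping rows when the right profile is shifted by dy
        if dy >= 0:
            a, b = lp[:h - dy], rp[dy:]
        else:
            a, b = lp[-dy:], rp[:h + dy]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        score = float(a @ b) / denom if denom > 0 else -np.inf
        if score > best_score:
            best_dy, best_score = dy, score
    return best_dy
```

A clearly non-zero result on a stereo pair suggests that the rows do not line up and that the pair should be rectified before running a stereo network.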

Here is a visualization of your data, plotting the left and right images together with horizontal lines drawn across both. You can see that the images are not rectified as expected: matching features, such as the root of the tree, do not fall on the same line. (image)
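A check of this kind can be reproduced with a few lines of NumPy. The sketch below is not the script used for the figure in this thread; the function name `side_by_side_with_lines` and its `step` and `color` parameters are hypothetical. It assumes two H x W x 3 uint8 images of identical shape.

```python
import numpy as np


def side_by_side_with_lines(left, right, step=40, color=(0, 255, 0)):
    """Place left and right images next to each other and draw horizontal lines.

    In a rectified pair, the same scene point should touch the same line in
    both halves of the returned canvas.
    """
    if left.shape != right.shape:
        raise ValueError("left and right images must have the same shape")
    # Concatenate along the width axis so rows stay aligned
    canvas = np.concatenate([left, right], axis=1).copy()
    # Overwrite every `step`-th row, starting half a step from the top
    canvas[step // 2::step, :] = color
    return canvas
```

The images can be loaded with `cv2.imread` and the canvas saved with `cv2.imwrite`, as in the snippet earlier in this thread. If a distinctive feature sits on a line in the left half but above or below it in the right half, the pair is not rectified.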

AppleAndBanana commented 1 year ago

Oh I see, thanks for your reply, I will try this again with rectification!