Thank you for the work.
I would like to ask a few questions about the evaluation. I cannot see from the paper or the code how you evaluate the method on the different datasets.
I am currently using the snippet you provided in the README, i.e., loading the model directly from torch.hub, passing the image in the [0, 1] range, and passing the (adapted) intrinsics.
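For reference, this is essentially what I am running. It is only a minimal sketch of my side: the hub repo/entrypoint names are placeholders for the actual identifiers in your README, and I am assuming the (image, intrinsics) call signature from the README snippet.

```python
import torch

# "ORG/REPO" and "MODEL_ENTRYPOINT" are placeholders for the actual
# torch.hub identifiers given in the README.
model = torch.hub.load("ORG/REPO", "MODEL_ENTRYPOINT", pretrained=True)
model.eval()

# Dummy input: (1, 3, H, W) RGB image already scaled to the [0, 1] range
rgb = torch.rand(1, 3, 192, 640)

# (1, 3, 3) pinhole intrinsics adapted to the network input resolution
# (example values only)
intrinsics = torch.tensor([[[370.0,   0.0, 320.0],
                            [  0.0, 370.0,  96.0],
                            [  0.0,   0.0,   1.0]]])

# Assuming the forward signature from the README snippet: (image, intrinsics)
with torch.no_grad():
    depth = model(rgb, intrinsics)
```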
For KITTI I resize the image to 192x640 (with padding) and adjust the intrinsics accordingly; the RGB image is passed to the network normalized to the [0, 1] range. However, the results do not match the paper: with the Garg-crop evaluation and depth capped at 80 m, RMSE is better, for instance, but the deltas are worse. Are you using the uncorrected (old) KITTI ground truth, i.e., the sparse 697 images, instead of the denser 652 ones?
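Concretely, this is a sketch of the preprocessing I apply to the KITTI images before inference (my own helper, not your code), in case the resizing/intrinsics handling is where I diverge from your setup:

```python
import torch
import torch.nn.functional as F

def resize_pad_with_intrinsics(rgb, K, target_h=192, target_w=640):
    """Resize to fit (target_h, target_w) keeping the aspect ratio, pad the
    remainder, and rescale the pinhole intrinsics by the same factor.
    rgb: (1, 3, H, W) in [0, 1]; K: (3, 3) intrinsics of the original image."""
    _, _, h, w = rgb.shape
    scale = min(target_h / h, target_w / w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))

    rgb = F.interpolate(rgb, size=(new_h, new_w),
                        mode="bilinear", align_corners=False)

    # Zero-pad bottom/right up to the target shape; padding on these sides
    # does not move the principal point.
    rgb = F.pad(rgb, (0, target_w - new_w, 0, target_h - new_h))

    K = K.clone()
    K[0, 0] *= scale  # fx
    K[1, 1] *= scale  # fy
    K[0, 2] *= scale  # cx
    K[1, 2] *= scale  # cy
    return rgb, K
```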
For NYU I tried both resizing (and padding) to 384x640 and not resizing at all, keeping the 480x640 shape. In both cases the results are far off: the depth maps show reasonable structure, but the overall scene is wrong.
I suspect something is wrong with my evaluation, so could you please provide more details on the evaluation code/pipeline/setup for datasets such as NYU and KITTI?
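For completeness, this is the metric computation I am using: standard Eigen-style metrics with the Garg crop and an 80 m cap (the crop fractions are the commonly used ones). If your evaluation differs from this, that could already explain part of the gap.

```python
import numpy as np

def compute_depth_metrics(pred, gt, max_depth=80.0, garg_crop=True):
    """pred, gt: (H, W) depth maps in meters; gt is sparse (0 = invalid)."""
    h, w = gt.shape
    valid = (gt > 1e-3) & (gt < max_depth)

    if garg_crop:
        # Crop from Garg et al., as commonly used on the KITTI Eigen split
        crop = np.zeros_like(valid)
        crop[int(0.40810811 * h):int(0.99189189 * h),
             int(0.03594771 * w):int(0.96405229 * w)] = True
        valid &= crop

    pred, gt = pred[valid], gt[valid]
    pred = np.clip(pred, 1e-3, max_depth)

    thresh = np.maximum(gt / pred, pred / gt)
    d1 = (thresh < 1.25).mean()
    d2 = (thresh < 1.25 ** 2).mean()
    d3 = (thresh < 1.25 ** 3).mean()
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    return dict(abs_rel=abs_rel, rmse=rmse, delta1=d1, delta2=d2, delta3=d3)
```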
Thanks in advance for the clarification.
Edit:
1) I saw that you are not using the center crop for the KITTI Eigen-split evaluation, so I tried the "direct" resizing you use, i.e., resizing without preserving the aspect ratio. However, I noticed that the intrinsics the codebase feeds to the model are not the calibrated ones: for example, the principal point is set to `height//2` and `width//2`, and the focal length does not correspond to the original one (see the snippet at the end of this edit).
2) I also tried to reproduce the results with this codebase, but the outputs differ from the numbers reported in the paper for KITTI. For NYU I cannot verify them, since the NYU dataset class does not seem to be provided in this codebase (or in the external links).
3) FYI, your external link to `efm_datasets` in this repo is broken.
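To make point 1) concrete, this is how I would have expected the intrinsics to be adapted when resizing without preserving the aspect ratio, versus what the dataloader appears to build (principal point at half the target resolution). The helper names and the `focal` placeholder are mine, just for illustration.

```python
import numpy as np

def rescale_intrinsics(K, orig_hw, target_hw):
    """What I would expect: the original calibration rescaled to the new resolution."""
    sy = target_hw[0] / orig_hw[0]
    sx = target_hw[1] / orig_hw[1]
    K = K.copy()
    K[0, 0] *= sx  # fx
    K[1, 1] *= sy  # fy
    K[0, 2] *= sx  # cx
    K[1, 2] *= sy  # cy
    return K

def center_intrinsics(target_hw, focal):
    """What the codebase seems to feed instead: principal point at
    (width//2, height//2) and a focal length that is not the rescaled
    original one (`focal` is just a stand-in for whatever value it uses)."""
    h, w = target_hw
    return np.array([[focal, 0.0, w // 2],
                     [0.0, focal, h // 2],
                     [0.0, 0.0, 1.0]])
```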