Closed zenithfang closed 4 years ago
The one that configures the depth maps in the order shown in mdataloader.kitti.
@cxlcl, could you provide the code and the results of the camera pose extraction from the raw KITTI dataset?
If you mean reading the camera poses from the raw KITTI dataset: we use the pykitti package to read them: https://github.com/NVlabs/neuralrgbd/blob/d560bd96126cb3a8d300bc866911d93929f7932a/code/mdataloader/kitti.py#L160 https://github.com/NVlabs/neuralrgbd/blob/d560bd96126cb3a8d300bc866911d93929f7932a/code/mdataloader/kitti.py#L168
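For reference, the pose pykitti returns for each frame is the IMU-to-world transform (`T_w_imu`), which still has to be chained with the calibration matrices to get a world-to-camera extrinsic. Below is a minimal sketch of that chain with plain numpy; the function name `pose_to_extM` and the exact calibration matrices used are placeholders, and the actual chain in mdataloader/kitti.py may differ in detail:

```python
import numpy as np

def pose_to_extM(T_w_imu, T_cam_velo, T_velo_imu):
    """Turn a pykitti oxts pose (IMU-to-world) into a world-to-camera
    extrinsic: T_cam_w = T_cam_velo @ T_velo_imu @ inv(T_w_imu).
    All inputs are 4x4 homogeneous transforms."""
    return T_cam_velo @ T_velo_imu @ np.linalg.inv(T_w_imu)

# Synthetic example: IMU moved 1m along x in world, identity calibration.
T_w_imu = np.eye(4)
T_w_imu[0, 3] = 1.0
extM = pose_to_extM(T_w_imu, np.eye(4), np.eye(4))
# The resulting extrinsic translates world points by -1m along x.
```

With real data, `T_cam_velo` and `T_velo_imu` would come from the pykitti calibration object rather than being identity.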
@cxlcl, thank you. I saw it. I briefly read your code and have one more question. In the file warping/homography.py, the get_rel_extrinsicM function is defined as:
```python
def get_rel_extrinsicM(ext_ref, ext_src):
    ''' Get the extrinsic matrix from ref_view to src_view '''
    return ext_src.dot(np.linalg.inv(ext_ref))
```
I don't understand why you compute the transformation matrix from ref_view to src_view. From my reading of your paper, we need the transformation from src_view to ref_view, so that we can compute the cost volume between the reference image and the warped source image. Can you clarify this for me?
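For concreteness, here is a quick numerical check of what this function returns, using synthetic poses and assuming each `extM` is a 4x4 world-to-camera transform: the returned matrix maps ref-camera coordinates into src-camera coordinates.

```python
import numpy as np

def get_rel_extrinsicM(ext_ref, ext_src):
    # Same formula as in warping/homography.py
    return ext_src.dot(np.linalg.inv(ext_ref))

def make_ext(tx):
    """Synthetic world-to-camera extrinsic: translation along x only."""
    E = np.eye(4)
    E[0, 3] = tx
    return E

ext_ref, ext_src = make_ext(1.0), make_ext(3.0)
X_world = np.array([0.5, 0.2, 2.0, 1.0])  # homogeneous world point

x_ref = ext_ref @ X_world   # the point in ref-camera coordinates
x_src = ext_src @ X_world   # the point in src-camera coordinates
rel = get_rel_extrinsicM(ext_ref, ext_src)

# rel takes ref-camera coordinates to src-camera coordinates:
assert np.allclose(rel @ x_ref, x_src)
```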
Yes, your understanding of the cost volume is correct. But we still need the transformation from the ref. view to the src. view so that we can do the 3D re-sampling for the prediction step: p(d_t) -> p(d_{t+1})
@cxlcl, I agree. But at line 271 of the file batchloader.py:
```python
src_cam_pose_ = [warp_homo.get_rel_extrinsicM(ref_dat_['extM'], src_cam_extM_)
                 for src_cam_extM_ in src_cam_extMs]
```
This line gets the relative transformation matrix from ref_view to src_view. When I look into your code for building the cost volume after feature extraction from the D-Net, you use the function est_swp_volume_v4 in the file homography.py, right? Essentially, this function implements the formula:

warped_src_at_depth_d = K * R * P_ref_cuda * d + K * t

But the rotation R and translation t are still from ref_view to src_view, so this function cannot warp the src image to the ref image. That's my current understanding of your code. Am I missing something?
I think what you might have missed is that in order to warp the src. view to the ref. view, we should: (1) for the grid of pixel locations in the ref. view, calculate their corresponding locations in the src. view; (2) do interpolation in the src. view: https://github.com/NVlabs/neuralrgbd/blob/c8071a0bcbd4c4e7ef95c44e7de9c51353ab9764/code/warping/homography.py#L447
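The two steps above can be sketched in plain numpy for a single hypothesized depth plane. This is only an illustration, not the repo's implementation: it uses nearest-neighbor sampling instead of the bilinear interpolation in homography.py, handles one depth candidate instead of a full sweep, and the `K`, `R`, `t` inputs are synthetic placeholders. `R` and `t` take ref-camera coordinates to src-camera coordinates, which is exactly why the ref-to-src relative extrinsic is needed for backward warping.

```python
import numpy as np

def warp_src_to_ref(src_img, K, R, t, depth):
    """Backward-warp src_img into the ref view at one hypothesized depth.
    R, t: ref-camera -> src-camera rigid transform; K: 3x3 intrinsics."""
    H, W = src_img.shape
    v, u = np.mgrid[0:H, 0:W].astype(np.float64)
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)

    # Step (1): ref pixels -> 3D points at `depth` -> src pixel locations.
    pts_ref = np.linalg.inv(K) @ pix * depth      # back-project into ref cam
    pts_src = R @ pts_ref + t[:, None]            # move into src cam frame
    proj = K @ pts_src
    uv_src = proj[:2] / proj[2]                   # project into src image

    # Step (2): sample the src view at those locations (nearest neighbor
    # here; the real code uses bilinear interpolation).
    us = np.clip(np.round(uv_src[0]).astype(int), 0, W - 1)
    vs = np.clip(np.round(uv_src[1]).astype(int), 0, H - 1)
    return src_img[vs, us].reshape(H, W)

# With identity intrinsics/rotation and a 1-unit x-translation at depth 1,
# the warp reduces to a one-pixel horizontal shift (clipped at the border).
src = np.arange(16, dtype=np.float64).reshape(4, 4)
out = warp_src_to_ref(src, np.eye(3), np.eye(3), np.array([1.0, 0.0, 0.0]), 1.0)
```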
See the discussion in slides 12-14 here
Thank you so much @cxlcl, I finally understand it.
Could you be more specific about what types of pre-processing are needed?