ActiveVisionLab / DFNet

DFNet: Enhance Absolute Pose Regression with Direct Feature Matching (ECCV 2022)
https://dfnet.active.vision
MIT License

Question about pose and fine-sample #2

Closed: LZL-CS closed this issue 2 years ago

LZL-CS commented 2 years ago

Hi, I am confused about the poses and the fine-sample function. Details are below:

Q1: I understand that poses_homo are camera-to-world poses (homogeneous: N 4 4), while pose_avg_homo is the averaged pose (i.e. the centre of all poses, also homogeneous: N 4 4). But I am confused about why we have to left-multiply, as in np.linalg.inv(pose_avg_homo) @ poses_homo. How can this formula be derived? https://github.com/ActiveVisionLab/DFNet/blob/45880b7e6230aa278ea0da7a33110bbf396de71c/dataset_loaders/load_7Scenes.py#L194

Q2: How should I understand this fine-sample function (or how can this formula be derived)? https://github.com/ActiveVisionLab/DFNet/blob/45880b7e6230aa278ea0da7a33110bbf396de71c/script/models/rendering.py#L63

I look forward to your response, thank you!

chenusc11 commented 2 years ago

Hi, both Q1 and Q2 use the same implementation as the original NeRF code, which I have not changed.

Please see here: https://github.com/bmild/nerf/blob/18b8aebda6700ed659cb27a0c348b737a5f6ab60/load_llff.py#L166 and https://github.com/bmild/nerf/blob/18b8aebda6700ed659cb27a0c348b737a5f6ab60/run_nerf_helpers.py#L183

  1. The purpose of the first part is to shift the GT camera poses so that they are centered at (0,0,0). I suppose it is done so that pose_avg_homo @ P_centered = P_original holds, which gives P_centered = np.linalg.inv(pose_avg_homo) @ P_original.

  2. The second part is a little harder to explain. It is essentially inverse transform sampling: the coarse weights along a ray are normalized into a PDF, accumulated into a CDF, and uniform random numbers are mapped back through the inverted CDF to get sample positions. At a high level, it generates more sample points near the object surface (see Section 5.2 of the NeRF paper).
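
For readers who want to see both steps concretely, here is a minimal NumPy sketch. It is not the DFNet or NeRF code: the linked implementations are batched over rays and differ in details. The function names, the (4, 4) shape assumed for pose_avg_homo, the single-ray layout, and the 1e-5 stabilizers are all assumptions made for illustration.

```python
import numpy as np


def center_poses(poses_homo, pose_avg_homo):
    """Express camera-to-world poses in the frame of the average pose.

    poses_homo:    (N, 4, 4) camera-to-world matrices.
    pose_avg_homo: (4, 4) camera-to-world matrix of the average pose
                   (shape assumed for this sketch).
    """
    # If pose_avg_homo @ P_centered = P_original, then left-multiplying both
    # sides by the inverse gives P_centered = inv(pose_avg_homo) @ P_original.
    # The (4, 4) inverse broadcasts over the N poses.
    return np.linalg.inv(pose_avg_homo) @ poses_homo


def sample_pdf(bins, weights, n_samples, rng):
    """Inverse transform sampling along a single ray.

    bins:    (M + 1,) depth values marking the bin edges.
    weights: (M,) non-negative coarse weights, one per bin.
    Returns (n_samples,) depths, concentrated where the weights are large.
    """
    # Normalize the weights into a piecewise-constant PDF over the bins.
    pdf = (weights + 1e-5) / np.sum(weights + 1e-5)
    # Accumulate into a CDF with a leading 0 so it lines up with the bin edges.
    cdf = np.concatenate([[0.0], np.cumsum(pdf)])
    # Draw uniform numbers and find the CDF interval each one falls into.
    u = rng.uniform(size=n_samples)
    idx = np.searchsorted(cdf, u, side="right")
    below = np.clip(idx - 1, 0, len(cdf) - 1)
    above = np.clip(idx, 0, len(cdf) - 1)
    # Linearly interpolate inside the interval to invert the CDF.
    denom = cdf[above] - cdf[below]
    denom = np.where(denom < 1e-5, 1.0, denom)  # guard against empty intervals
    t = (u - cdf[below]) / denom
    return bins[below] + t * (bins[above] - bins[below])
```

Because each uniform draw lands in a CDF interval with probability equal to that bin's normalized weight, bins with large coarse weights (typically near a surface) receive proportionally more fine samples.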

LZL-CS commented 2 years ago

Hi @chenusc11, thanks for your reply. I will digest the references you provided.