ActiveVisionLab / DFNet

DFNet: Enhance Absolute Pose Regression with Direct Feature Matching (ECCV 2022)
https://dfnet.active.vision
MIT License

Question about the histogram #4

Closed LeiJiang1 closed 1 year ago

LeiJiang1 commented 2 years ago

It seems that you use a lookup table (torch.nn.Embedding) to represent the histogram, rather than computing it on the luma channel Y of the target image in YUV space, which I can only find in the dataset preparation code. If torch.nn.Embedding represents the histogram (i.e. it acts as the appearance code), the method would be almost the same as nerfw. I am confused about which representation you actually use, or perhaps I missed something. I would really appreciate it if you could answer this question. Many thanks in advance.

chenusc11 commented 2 years ago

Hi, @LeiJiang1 thanks for the question.

The histogram-assisted NeRF is written based on my implementation of nerfw. So in a way, it is similar to nerfw in terms of using latent code as conditional input.

The main difference, however, is that nerfw is conditioned on the frame index of a training image. At inference time, it can therefore only render a view whose luminance and exposure resemble those of the specified training image.

In our case, we can use the histogram extracted from the test image to render the view. This better suits the task of camera relocalization, since the test sequence may not have been captured at the same time as the training sequence.
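To make "the histogram extracted from the test image" concrete, here is a minimal framework-free sketch of a normalized luma histogram. It is not the repository's dataset code: the function name `luma_histogram`, the bin count, and the use of BT.601 luma weights for the Y channel are assumptions made for illustration.

```python
def luma_histogram(pixels, bins=10):
    """Normalized histogram of the Y (luma) channel of an RGB image.

    pixels: iterable of (r, g, b) tuples with components in [0, 255].
    bins:   number of histogram bins (hypothetical default).
    """
    counts = [0] * bins
    total = 0
    for r, g, b in pixels:
        # BT.601 luma weights (assumed here for the Y channel of YUV).
        y = 0.299 * r + 0.587 * g + 0.114 * b
        # Map y in [0, 255] to a bin index, clamped to the last bin.
        idx = min(int(y * bins / 256), bins - 1)
        counts[idx] += 1
        total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    # Normalize so the histogram sums to 1, independent of image size.
    return [c / total for c in counts]


# Toy "image": three dark pixels and one bright pixel.
toy_image = [(10, 10, 10), (20, 20, 20), (30, 30, 30), (200, 200, 200)]
hist = luma_histogram(toy_image, bins=4)
```

Because the histogram depends only on pixel values, it can be computed for any image, including test images that were never seen during training.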

LeiJiang1 commented 2 years ago

Dear Chen, many thanks for your reply. I understand the difference between your method and nerfw, and I fully understand how the nerfw code uses the frame index. However, I cannot find where the histogram extracted from the test image is passed to the NeRF model, which confuses me. Could you point out where that happens?

chenusc11 commented 1 year ago

Hi, I see. Sorry for the confusion.

The histogram is passed here using img_idx https://github.com/ActiveVisionLab/DFNet/blob/1389760f770851a77e601af1312f19fe065bd185/script/run_nerf.py#L51

At NeRF inference time, we embed the histogram into a latent code here https://github.com/ActiveVisionLab/DFNet/blob/1389760f770851a77e601af1312f19fe065bd185/script/models/nerfw.py#L68-L72
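The distinction discussed in this thread can be sketched as follows. This is not the code at the linked lines: it is a framework-free illustration that assumes the histogram is mapped to the latent code by a learned linear map, and the helper names and random stand-in weights are hypothetical. In PyTorch, the two variants would roughly correspond to `torch.nn.Embedding` (index lookup) and a layer such as `torch.nn.Linear` (histogram input).

```python
import random


def make_lookup_table(num_train_frames, code_dim, rng):
    # Index-conditioned variant: one latent row per training frame.
    return [[rng.gauss(0, 1) for _ in range(code_dim)]
            for _ in range(num_train_frames)]


def make_linear_map(hist_bins, code_dim, rng):
    # Histogram-conditioned variant (assumed): a code_dim x hist_bins matrix.
    return [[rng.gauss(0, 1) for _ in range(hist_bins)]
            for _ in range(code_dim)]


def code_from_index(table, frame_idx):
    # Only defined for frame indices that existed during training.
    return table[frame_idx]


def code_from_histogram(weights, hist):
    # Matrix-vector product: works for any histogram, seen or unseen.
    return [sum(w * h for w, h in zip(row, hist)) for row in weights]


rng = random.Random(0)  # random values stand in for learned parameters
table = make_lookup_table(num_train_frames=3, code_dim=4, rng=rng)
weights = make_linear_map(hist_bins=4, code_dim=4, rng=rng)

# A test image has no training frame index, but it does have a histogram.
test_hist = [0.75, 0.0, 0.0, 0.25]
latent = code_from_histogram(weights, test_hist)

try:
    code_from_index(table, 7)  # frame 7 was never part of training
    unseen_index_ok = True
except IndexError:
    unseen_index_ok = False
```

The index lookup fails for a frame outside the training set, while the histogram path produces a latent code for any input image. That is the property that matters for relocalization on a separately captured test sequence.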

LeiJiang1 commented 1 year ago

Oh, I finally got it. Thanks so much for your patient reply. I really appreciate it.