Difference between IBRnetwithNeuray model and Neuray model

liuyuan-pal / NeuRay

[CVPR2022] Neural Rays for Occlusion-aware Image-based Rendering

GNU General Public License v3.0



Closed wenzhengchen closed 2 years ago

wenzhengchen commented 2 years ago

Hi, thanks for sharing the code for this amazing work!

If I understand correctly, IBRNet doesn't consider visibility in each view, while NeuRay does, with help from the depth map. In the implementation, I saw there is a class called IBRNetWithNeuRay https://github.com/liuyuan-pal/NeuRay/blob/a877129a76dc7ef6527254e7e6e84ff808f6322f/network/ibrnet.py#L239. In terms of implementation, would this model have the same performance as the NeuRay model itself (NeuralRayGenRenderer)? https://github.com/liuyuan-pal/NeuRay/blob/a877129a76dc7ef6527254e7e6e84ff808f6322f/network/renderer.py#L256
Or does the NeuRay model have other design choices that make it even better?

Thank you!

Best, Wenzheng

liuyuan-pal commented 2 years ago

Hi, thanks! IBRNetWithNeuRay is exactly the same as IBRNet but with additional visibility terms. However, NeuralRayGenRenderer without visibility will be slightly worse than the original IBRNet because its image encoder is smaller than the one used in IBRNet. We use a smaller encoder because of memory limitations during training. https://github.com/liuyuan-pal/NeuRay/blob/a877129a76dc7ef6527254e7e6e84ff808f6322f/network/renderer.py#L58

wenzhengchen commented 2 years ago

Thanks for the answer!

So basically, in terms of performance, NeuRay (no visibility) < IBRNet (no visibility) < NeuRay (with visibility), where the first gap is due to the smaller MLP, right?

I also have another question: would IBRNet (with visibility) be better than NeuRay (with visibility)? Or do they have similar performance? I guess the former would indicate that a larger MLP still helps training, while the latter would mean that visibility is more important.

liuyuan-pal commented 2 years ago

Hi, the image encoder is actually a CNN (not an MLP) that is in charge of extracting image features for feature aggregation (matching). Using a larger CNN gives stronger image features, which helps find more accurate surfaces (density). In general, NeuRay model = IBRNet model + visibility. We encode the visibility in feature vectors associated with rays, so we call the model Neural Rays; we then apply this visibility in an IBRNet model to predict density and colors.
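
For readers following along, here is a toy sketch of the "IBRNet model + visibility" idea. It is not the repository's code. The function `aggregate` and the hand-set visibility weights are hypothetical and chosen only for illustration. In NeuRay the visibility comes from feature vectors associated with rays, and the aggregation is presumably a learned network rather than a weighted mean.

```python
def aggregate(colors, visibilities=None):
    """Blend per-view RGB samples for one 3D point (illustrative only).

    colors: list of (r, g, b) tuples, one per source view.
    visibilities: optional list of weights in [0, 1], one per view.
        None means every view counts equally (no visibility term).
    """
    if visibilities is None:
        visibilities = [1.0] * len(colors)
    total = sum(visibilities)
    if total == 0.0:
        # No view sees the point: fall back to a plain average.
        visibilities = [1.0] * len(colors)
        total = float(len(colors))
    return tuple(
        sum(w * c[k] for w, c in zip(visibilities, colors)) / total
        for k in range(3)
    )


# Two views see a red surface; a third view is occluded and sees a blue object.
colors = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
plain = aggregate(colors)                   # occluded view pollutes the blend
aware = aggregate(colors, [1.0, 1.0, 0.0])  # occluded view is ignored
print(plain, aware)
```

Without the weights, the occluded view leaks blue into the blended color. With the weights, only the two views that actually see the surface contribute.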