andrewsonga / Total-Recon

[ICCV 2023] Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis
https://andrewsonga.github.io/totalrecon

Custom LPIPs Implementation #7

Closed by coltonstearns 1 month ago

coltonstearns commented 1 month ago

Hello, I was wondering if you could verify that the custom LPIPS implementation in your evaluation matches that of popular libraries such as the "lpips" library (https://pypi.org/project/lpips/) or the torchmetrics library (https://lightning.ai/docs/torchmetrics/stable/image/learned_perceptual_image_patch_similarity.html). I would like to baseline against your method, but I am currently unsure whether I will need to port your LPIPS evaluation because it does something different than these standard libraries.

andrewsonga commented 1 month ago

Hello Colton,

Thank you for your interest in Total-Recon and raising this issue.

I have checked, and there does indeed seem to be a slight discrepancy between Total-Recon's LPIPS implementation and the latest implementations in other libraries (Zhang et al.'s and torchmetrics'). I made a Jupyter notebook (https://github.com/andrewsonga/Total-Recon/blob/main/lpips_test.ipynb) that carries out the comparisons.
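For readers who want to run a similar check on their own image pairs, here is a minimal sketch of a comparison harness. It is not the code from the notebook above: `compare_metrics` and its arguments are hypothetical names, and each metric is assumed to be any callable that takes a reference image and a rendered image and returns a scalar score.

```python
from statistics import mean
from typing import Any, Callable, Dict, Iterable, Tuple

# Assumption: a "metric" is any callable mapping (reference, rendered) to a scalar.
Metric = Callable[[Any, Any], float]


def compare_metrics(
    metric_a: Metric,
    metric_b: Metric,
    pairs: Iterable[Tuple[Any, Any]],
) -> Dict[str, float]:
    """Score every (reference, rendered) pair with both metrics and summarize the gap."""
    scores_a, scores_b = [], []
    for ref, img in pairs:
        scores_a.append(float(metric_a(ref, img)))
        scores_b.append(float(metric_b(ref, img)))
    if not scores_a:
        raise ValueError("no image pairs given")

    # Per-pair absolute difference between the two implementations.
    diffs = [abs(a - b) for a, b in zip(scores_a, scores_b)]
    return {
        "mean_a": mean(scores_a),
        "mean_b": mean(scores_b),
        "max_abs_diff": max(diffs),
        "mean_abs_diff": mean(diffs),
    }
```

With the `lpips` package installed, `metric_a` could wrap an `lpips.LPIPS(net='alex')` instance and `metric_b` could wrap the older implementation under test. Those wrappers are left out here because the exact call signature of Total-Recon's evaluation code does not appear in this thread.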

The discrepancy stems from the following: Total-Recon's LPIPS implementation was taken from the NSFF repo, which in turn took it from the original LPIPS library written by Zhang et al. (according to this GitHub issue: https://github.com/zhengqili/Neural-Scene-Flow-Fields/issues/6). The NSFF repo used a version of the original LPIPS library that predates this commit from September 5th, 2020 (https://github.com/richzhang/PerceptualSimilarity/commit/c33f89e9f46522a584cf41d8880eb0afa982708b), which carried out a large refactoring of the codebase. It is highly likely that the changes made in this commit and all subsequent commits are the reason for the discrepancy.

So I think the best way to baseline our method is to use the latest LPIPS implementation (whether that be Zhang et al.'s or torchmetrics'). I would like to keep Total-Recon's current implementation as it is so that it matches the metrics reported in the paper.

If you run into any issues while baselining Total-Recon (LPIPS-related or otherwise), please let me know :)

Best, Andrew

coltonstearns commented 1 month ago

Ok great, thank you for this comparison! Even though the results are slightly different, they seem to be roughly the same :)