facebookresearch / localrf

An algorithm for reconstructing the radiance field of a large-scale scene from a single casually captured video.
MIT License
976 stars 60 forks

[Question] Results on tanks and temples #35

Closed sevashasla closed 1 year ago

sevashasla commented 1 year ago

Hello! Thank you for the great work! Could you please share a table with the results on Tanks and Temples? I mean, for each of the 9 scenes.

ameuleman commented 1 year ago
Hi, here are the per-scene metrics:

| Method | Scene | PSNR (dB) | SSIM | LPIPS |
| --- | --- | --- | --- | --- |
| Nerfacto | M60 | 23.51 | 0.755 | 0.349 |
| Nerfacto | Panther | 25.88 | 0.796 | 0.344 |
| Nerfacto | Train | 15.95 | 0.598 | 0.406 |
| Nerfacto | Auditorium | 20.91 | 0.791 | 0.388 |
| Nerfacto | Ballroom | 11.21 | 0.315 | 0.897 |
| Nerfacto | Courtroom | 17.63 | 0.592 | 0.498 |
| Nerfacto | Museum | 17.03 | 0.627 | 0.387 |
| Nerfacto | Caterpillar | 17.31 | 0.521 | 0.512 |
| Nerfacto | Church | 18.51 | 0.653 | 0.410 |
| Mip-NeRF360 | M60 | 26.08 | 0.823 | 0.287 |
| Mip-NeRF360 | Panther | 27.13 | 0.839 | 0.282 |
| Mip-NeRF360 | Train | 21.02 | 0.660 | 0.379 |
| Mip-NeRF360 | Auditorium | 29.93 | 0.858 | 0.338 |
| Mip-NeRF360 | Ballroom | 14.02 | 0.372 | 0.803 |
| Mip-NeRF360 | Courtroom | 23.63 | 0.694 | 0.436 |
| Mip-NeRF360 | Museum | 24.55 | 0.704 | 0.340 |
| Mip-NeRF360 | Caterpillar | 22.42 | 0.598 | 0.446 |
| Mip-NeRF360 | Church | 24.65 | 0.740 | 0.339 |
| NeRF++ | M60 | 19.91 | 0.621 | 0.622 |
| NeRF++ | Panther | 20.87 | 0.633 | 0.633 |
| NeRF++ | Train | 19.03 | 0.537 | 0.608 |
| NeRF++ | Auditorium | 24.96 | 0.765 | 0.582 |
| NeRF++ | Ballroom | 15.33 | 0.371 | 0.883 |
| NeRF++ | Courtroom | 20.97 | 0.579 | 0.680 |
| NeRF++ | Museum | 20.52 | 0.523 | 0.682 |
| NeRF++ | Caterpillar | 20.12 | 0.437 | 0.706 |
| NeRF++ | Church | 20.53 | 0.530 | 0.741 |
| Mega-NeRF | M60 | 17.77 | 0.613 | 0.522 |
| Mega-NeRF | Panther | 18.16 | 0.614 | 0.515 |
| Mega-NeRF | Train | 16.03 | 0.465 | 0.749 |
| Mega-NeRF | Auditorium | 23.54 | 0.767 | 0.435 |
| Mega-NeRF | Ballroom | 15.40 | 0.349 | 0.871 |
| Mega-NeRF | Courtroom | 20.52 | 0.592 | 0.602 |
| Mega-NeRF | Museum | 18.31 | 0.490 | 0.738 |
| Mega-NeRF | Caterpillar | 18.50 | 0.430 | 0.626 |
| Mega-NeRF | Church | 19.25 | 0.538 | 0.536 |
| LocalRF (COLMAP) | M60 | 21.88 | 0.688 | 0.541 |
| LocalRF (COLMAP) | Panther | 23.20 | 0.708 | 0.526 |
| LocalRF (COLMAP) | Train | 21.89 | 0.663 | 0.395 |
| LocalRF (COLMAP) | Auditorium | 27.27 | 0.819 | 0.445 |
| LocalRF (COLMAP) | Ballroom | 18.79 | 0.529 | 0.566 |
| LocalRF (COLMAP) | Courtroom | 23.01 | 0.672 | 0.468 |
| LocalRF (COLMAP) | Museum | 22.88 | 0.621 | 0.508 |
| LocalRF (COLMAP) | Caterpillar | 21.87 | 0.553 | 0.543 |
| LocalRF (COLMAP) | Church | 23.69 | 0.671 | 0.471 |
| SCNeRF (COLMAP init) | M60 | 14.78 | 0.545 | 0.652 |
| SCNeRF (COLMAP init) | Panther | 16.10 | 0.556 | 0.652 |
| SCNeRF (COLMAP init) | Train | 15.73 | 0.473 | 0.648 |
| SCNeRF (COLMAP init) | Auditorium | 21.43 | 0.728 | 0.588 |
| SCNeRF (COLMAP init) | Ballroom | 11.30 | 0.304 | 0.928 |
| SCNeRF (COLMAP init) | Courtroom | 19.43 | 0.551 | 0.700 |
| SCNeRF (COLMAP init) | Museum | 17.87 | 0.469 | 0.695 |
| SCNeRF (COLMAP init) | Caterpillar | 16.51 | 0.374 | 0.753 |
| SCNeRF (COLMAP init) | Church | 17.87 | 0.487 | 0.752 |
| BARF (COLMAP init) | M60 | 8.05 | 0.431 | 0.858 |
| BARF (COLMAP init) | Panther | 9.38 | 0.451 | 0.852 |
| BARF (COLMAP init) | Train | 9.01 | 0.356 | 0.852 |
| BARF (COLMAP init) | Auditorium | 13.01 | 0.580 | 0.813 |
| BARF (COLMAP init) | Ballroom | 9.49 | 0.255 | 0.931 |
| BARF (COLMAP init) | Courtroom | 9.46 | 0.354 | 0.925 |
| BARF (COLMAP init) | Museum | 10.79 | 0.353 | 0.936 |
| BARF (COLMAP init) | Caterpillar | 8.55 | 0.255 | 0.945 |
| BARF (COLMAP init) | Church | 10.68 | 0.376 | 0.925 |
| LocalRF (COLMAP init) | M60 | 22.93 | 0.709 | 0.504 |
| LocalRF (COLMAP init) | Panther | 24.22 | 0.727 | 0.497 |
| LocalRF (COLMAP init) | Train | 22.89 | 0.703 | 0.349 |
| LocalRF (COLMAP init) | Auditorium | 27.42 | 0.818 | 0.447 |
| LocalRF (COLMAP init) | Ballroom | 19.97 | 0.567 | 0.499 |
| LocalRF (COLMAP init) | Courtroom | 23.27 | 0.675 | 0.454 |
| LocalRF (COLMAP init) | Museum | 22.34 | 0.605 | 0.532 |
| LocalRF (COLMAP init) | Caterpillar | 21.98 | 0.559 | 0.528 |
| LocalRF (COLMAP init) | Church | 24.03 | 0.678 | 0.461 |
| BARF | M60 | 10.74 | 0.498 | 0.835 |
| BARF | Panther | 11.69 | 0.507 | 0.845 |
| BARF | Train | 11.46 | 0.405 | 0.817 |
| BARF | Auditorium | 15.79 | 0.645 | 0.811 |
| BARF | Ballroom | 10.44 | 0.271 | 0.945 |
| BARF | Courtroom | 12.49 | 0.420 | 0.896 |
| BARF | Museum | 12.17 | 0.383 | 0.935 |
| BARF | Caterpillar | 11.01 | 0.289 | 0.944 |
| BARF | Church | 12.15 | 0.404 | 0.930 |
| LocalRF | M60 | 18.66 | 0.624 | 0.638 |
| LocalRF | Panther | 20.65 | 0.649 | 0.634 |
| LocalRF | Train | 20.03 | 0.580 | 0.540 |
| LocalRF | Auditorium | 24.97 | 0.770 | 0.588 |
| LocalRF | Ballroom | 18.98 | 0.490 | 0.622 |
| LocalRF | Courtroom | 21.03 | 0.597 | 0.616 |
| LocalRF | Museum | 19.86 | 0.519 | 0.710 |
| LocalRF | Caterpillar | 20.36 | 0.475 | 0.647 |
| LocalRF | Church | 21.90 | 0.593 | 0.618 |
sevashasla commented 1 year ago

Thank you very much! Could you also please share the hyperparameters that you used to run your model on the T&T dataset? Is FOV = 71 in your case? Also, is the calculation of the metrics the same as in the paper? Why is Mip-NeRF360 better now?

ameuleman commented 1 year ago

Yes, we use FOV = 71. How are you averaging the metrics? We average PSNR in the square-error domain and SSIM in the sqrt(1 - SSIM) domain.

sevashasla commented 1 year ago

Actually, I just used simple averaging for every subsection of your table (I mean for Nerfacto, LocalRF (COLMAP), LocalRF (COLMAP init), ...). What do you mean by averaging "in a domain"? Is this from paper [5]? Could you please share the averaging formula? Are the results mentioned in the paper obtained from this table?

tb2-sy commented 1 year ago

> Actually, I just used simple averaging for every subsection of your table (I mean for Nerfacto, LocalRF (COLMAP), LocalRF (COLMAP init), ...). What do you mean by averaging "in a domain"? Is this from paper [5]? Could you please share the averaging formula? Are the results mentioned in the paper obtained from this table?

Hello! I think you can refer to render.py and train.py: roughly, the MSE values of all images are averaged, and then this average is converted into a PSNR value.

ameuleman commented 1 year ago

Hi

> Hello! I think you can refer to render.py and train.py: roughly, the MSE values of all images are averaged, and then this average is converted into a PSNR value.

Yes, we average square error values before converting to PSNR as in render.py: $\text{MSE} = \frac1n\displaystyle\sum_{i=1}^{n} \text{MSE}_i$ and $\overline{\text{PSNR}} = 10 \log_{10}\left(\frac{1}{\text{MSE}} \right)$.

For SSIM, we convert to the $\sqrt{(1-\text{SSIM})}$ domain for averaging, before converting back to SSIM: $\overline{\text{SSIM}} = 1- \left(\frac1n\displaystyle\sum_{i=1}^{n} \sqrt{(1-\text{SSIM}_i)}\right)^2$. This is not currently implemented in render.py.
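For reference, here is a minimal sketch of both averaging rules (illustrative only; the helper names are mine and not from render.py, and `psnr_to_mse` assumes a peak value of 1):

```python
import numpy as np

def psnr_to_mse(psnr):
    """Invert PSNR = 10 * log10(1 / MSE) (peak value 1.0)."""
    return 10.0 ** (-psnr / 10.0)

def average_psnr(mse_per_item):
    """Average PSNR in the square-error domain: mean the MSEs first,
    then convert the mean to dB once."""
    return 10.0 * np.log10(1.0 / np.mean(mse_per_item))

def average_ssim(ssim_per_item):
    """Average SSIM in the sqrt(1 - SSIM) domain, then map back."""
    d = np.mean(np.sqrt(1.0 - np.asarray(ssim_per_item)))
    return 1.0 - d**2

# Example: aggregate Nerfacto's per-scene rows from the table above
# (this weights each scene equally, regardless of its image count).
psnrs = [23.51, 25.88, 15.95, 20.91, 11.21, 17.63, 17.03, 17.31, 18.51]
ssims = [0.755, 0.796, 0.598, 0.791, 0.315, 0.592, 0.627, 0.521, 0.653]
print(average_psnr([psnr_to_mse(p) for p in psnrs]))  # lower than the naive mean
print(average_ssim(ssims))
```

Note that the square-error-domain average is never higher than the naive mean of the per-item PSNRs, since poorly reconstructed items dominate the averaged MSE.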

sevashasla commented 1 year ago

Thanks for your reply!

tb2-sy commented 1 year ago

> Yes, we average square error values before converting to PSNR as in render.py: $\text{MSE} = \frac1n\displaystyle\sum_{i=1}^{n} \text{MSE}_i$ and $\overline{\text{PSNR}} = 10 \log_{10}\left(\frac{1}{\text{MSE}} \right)$.
>
> For SSIM, we convert to the $\sqrt{1-\text{SSIM}}$ domain for averaging, before converting back to SSIM: $\overline{\text{SSIM}} = 1- \left(\frac1n\displaystyle\sum_{i=1}^{n} \sqrt{1-\text{SSIM}_i}\right)^2$. This is not currently implemented in render.py.

Thanks for your reply! I would also like to ask whether the metrics are computed from the saved rendered images, or directly from the renders without saving them as images. I think there will be some error, because saving an image involves a rounding operation.
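For context, here is a quick synthetic check of the rounding effect I mean (illustrative only, not the localrf evaluation code): quantizing a float render to 8 bits before computing the metric perturbs the MSE slightly.

```python
import numpy as np

rng = np.random.default_rng(0)
gt = rng.random((64, 64, 3)).astype(np.float32)  # synthetic "ground truth" in [0, 1]
render = np.clip(gt + 0.01 * rng.standard_normal(gt.shape), 0.0, 1.0)

def psnr(a, b):
    return 10.0 * np.log10(1.0 / np.mean((a - b) ** 2))

# An 8-bit save/load round trip amounts to rounding to the nearest 1/255:
render_8bit = np.round(render * 255.0) / 255.0

print(psnr(gt, render))       # metric computed directly on the float render
print(psnr(gt, render_8bit))  # metric after quantization; slightly different
```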