poetrywanderer / CF-NeRF

[ECCV 2022] Official code release of Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification

question about AUSE metric. #3

Open lilygoli opened 1 year ago

lilygoli commented 1 year ago

Hi,

In the code, at the linked line the errors are not normalized (the normalization is commented out). I was wondering why this is, and whether the metrics reported in the paper were computed without normalization.

Thank you!

poetrywanderer commented 1 year ago

Hi,

We know that AUSE measures the correlation between the uncertainty estimates and the predictive errors. The AUSE results shown in the tables of the paper were reported without normalization. The reason behind this is that we believe AUSE without normalization can not only measure the quality of the uncertainty estimation, but also reveal the prediction quality of the reconstructed images. We believe the latter is equally important and makes the comparison more comprehensive and reasonable.

Actually, since the exact AUSE values are very small, for better illustration we simply divide them all by the same maximum predictive error across these methods. This is exactly what we show in Table 2 and Table 3 of the paper.
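For reference, here is a minimal sketch of how a sparsification-based AUSE is typically computed (this is not the repo's sparsification_plot implementation, and the function and argument names are placeholders): pixels are removed in order of decreasing uncertainty, the mean error of the kept pixels is traced out, and that curve is compared against the oracle curve obtained by sorting on the errors themselves. Dividing the errors by a shared constant beforehand, as described above, only rescales the curves.

```python
import numpy as np

def sparsification_curve(err, order, fractions):
    """Mean error of the pixels kept after removing the worst fraction f."""
    err_sorted = err[order]            # pixels to be removed first come first
    n = len(err)
    return np.array([err_sorted[int(f * n):].mean() for f in fractions])

def ause(err, uncertainty, n_steps=100, norm_const=None):
    """Area between the uncertainty-sorted and oracle-sorted sparsification curves."""
    if norm_const is not None:         # e.g. a shared max predictive error
        err = err / norm_const
    fractions = np.linspace(0.0, 1.0, n_steps, endpoint=False)
    curve_unc = sparsification_curve(err, np.argsort(-uncertainty), fractions)
    curve_oracle = sparsification_curve(err, np.argsort(-err), fractions)
    return np.trapz(curve_unc - curve_oracle, fractions)
```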

In addition, the Supplementary Materials show the comparison results for AUSE with normalization in Fig. 6.

Best, S.

lilygoli commented 1 year ago

Thanks for the reply. I didn't quite understand what you meant by this:

Actually, since the exact AUSE values are very small, for better illustration we simply divide them all by the same maximum predictive error across these methods. This is exactly what we show in Table 2 and Table 3 of the paper.

Do you mean you divide all errors by a fixed number? More precisely, is the code provided for the sparsification_plot method here, as it currently is, the code used for Tables 2 and 3? If not, what are the changes?

Also, for depth AUSE, why are the metrics reported on disparity rather than depth? Especially for the LF dataset, where the training cameras are close to each other, isn't the error of the predicted depth much higher for far-away points? Computing the error in inverse depth has the effect of ignoring this high error (for example, assume for point 1 pred_depth=1 and gt_depth=2, and for point 2 pred_depth=1000 and gt_depth=100; then clearly in depth error_1 < error_2, but if you do the comparison in inverse depth it would appear that point 1 has the larger error).
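For what it's worth, a quick numpy check of the toy numbers in that example (toy values only, not from the repo) shows the ordering flipping between the two spaces:

```python
import numpy as np

pred_depth = np.array([1.0, 1000.0])   # point 1, point 2
gt_depth   = np.array([2.0, 100.0])

depth_err = np.abs(pred_depth - gt_depth)               # [1.0, 900.0] -> point 2 is far worse
disp_err  = np.abs(1.0 / pred_depth - 1.0 / gt_depth)   # [0.5, 0.009] -> point 1 looks worse
```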

Thanks!

RPFey commented 11 months ago

@lilygoli Hi, can you reproduce the results in Table 2?

lilygoli commented 11 months ago

Yeah, I think I was able to get an AUSE RMSE for the LF dataset close to the value reported on disparity. However, you might need to retrain a few times.

mezzelfo commented 11 months ago

@lilygoli Hi! I'm trying to reproduce the results in Table 2. I can't reproduce even the quality metric PSNR on the Africa scene (reported as 23.84 dB). I tried retraining a few times but I always get the same results. Can you share the model that reproduces the results? How exactly did you obtain it?
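In case the discrepancy comes from the evaluation step rather than training, this is the standard PSNR definition for images scaled to [0, 1]; it is just the usual formula, not necessarily the exact evaluation code used in this repo.

```python
import numpy as np

def psnr(pred, gt):
    """PSNR in dB for images with pixel values in [0, 1]."""
    mse = np.mean((pred - gt) ** 2)
    return -10.0 * np.log10(mse)
```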