Wanggcong / StyleLight

[ECCV 2022] StyleLight: HDR Panorama Generation for Lighting Estimation and Editing
MIT License

Evaluation gives different scores from the paper #9

Open pureexe opened 10 months ago

pureexe commented 10 months ago

Thank you for creating such great work.

I'm currently working on lighting estimation and want to compare it with StyleLight.

However, when I ran your code on the Laval Indoor dataset, the scores were noticeably different from those reported in the paper, especially the angular error. Is this gap acceptable? If not, what is the correct way to run the evaluation?

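| Ball                   | Metric        | Reported in paper | Run by myself |
|------------------------|---------------|-------------------|---------------|
| M (Mirror ball)        | Angular Error | 4.30              | 6.74          |
| M (Mirror ball)        | RMSE          | 0.56              | 0.58          |
| M (Mirror ball)        | si-RMSE       | 0.55              | 0.56          |
| S (Silver matte ball)  | Angular Error | 2.96              | 4.61          |
| S (Silver matte ball)  | RMSE          | 0.30              | 0.33          |
| S (Silver matte ball)  | si-RMSE       | 0.29              | 0.32          |
| D (Diffuse ball)       | Angular Error | 2.41              | 4.07          |
| D (Diffuse ball)       | RMSE          | 0.15              | 0.16          |
| D (Diffuse ball)       | si-RMSE       | 0.11              | 0.13          |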

Here is how I obtained these scores, so you can point out which step I got wrong.

  1. Prepare the cropped Laval indoor dataset by running
  2. Change the input path by pointing root_path to the correct path
  3. Tonemap the ground truth in the directory named test using evaluation/
  4. Tonemap the output of StyleLight using evaluation/
  5. Render the ground truth into 3 balls using evaluation/
  6. Render the output of StyleLight into 3 balls using evaluation/
  7. Compute the scores by running evaluation/ --fake <stylelight's tone-mapped dir> --real <ground truth's tone-mapped dir>
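For readers comparing numbers, it helps to pin down what the three metrics in the table measure. The thread does not define them, so the following is only a minimal sketch of one common formulation, not the repository's evaluation script. The function names (`rmse`, `si_rmse`, `angular_error_deg`), the single global least-squares scale in si-RMSE, and the rule for skipping near-black pixels are all assumptions; differences in choices like these can by themselves shift the scores.

```python
import numpy as np


def rmse(pred, gt):
    """Root-mean-square error over all pixels and channels."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.sqrt(np.mean((pred - gt) ** 2)))


def si_rmse(pred, gt):
    """Scale-invariant RMSE (assumed form): rescale pred by the single
    least-squares factor alpha that best matches gt, then take the RMSE."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    alpha = np.sum(pred * gt) / max(np.sum(pred * pred), 1e-12)
    return rmse(alpha * pred, gt)


def angular_error_deg(pred, gt, eps=1e-8):
    """Mean per-pixel angle, in degrees, between RGB vectors.
    Pixels that are (near) black in either image are skipped (assumption)."""
    p = np.asarray(pred, dtype=np.float64).reshape(-1, 3)
    g = np.asarray(gt, dtype=np.float64).reshape(-1, 3)
    p_norm = np.linalg.norm(p, axis=1)
    g_norm = np.linalg.norm(g, axis=1)
    valid = (p_norm > eps) & (g_norm > eps)
    cos = np.sum(p[valid] * g[valid], axis=1) / (p_norm[valid] * g_norm[valid])
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean())
```

Under these definitions, angular error and si-RMSE ignore a global brightness scale while plain RMSE does not. That may be relevant when, as in the table above, the angular error differs much more than the two RMSE variants.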

Best regards.

Wanggcong commented 10 months ago

Sorry, I am very busy these days, and I am afraid I have no time to run the code and check the differences. One thing you might have missed is that I removed the black regions and warped the panoramas to some extent so that the object is placed at a new center. This is similar to the method of Gardner et al., "Learning to Predict Indoor Illumination from a Single Image", SIGGRAPH Asia, 2017.
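To illustrate what "placing the object at a new center" can mean for an equirectangular panorama, here is a minimal sketch of the simplest form of recentering: a rotation about the vertical axis, which amounts to a circular shift of image columns. This is not the author's preprocessing and not the warp of Gardner et al., which goes beyond a pure rotation. The function names `rotate_panorama_yaw` and `center_on_column` are hypothetical.

```python
import numpy as np


def rotate_panorama_yaw(pano, yaw_deg):
    """Rotate an equirectangular panorama (H x W x C) about the vertical
    axis by circularly shifting its columns. 360 degrees = full width."""
    width = pano.shape[1]
    shift = int(round(yaw_deg / 360.0 * width))
    return np.roll(pano, shift, axis=1)


def center_on_column(pano, col):
    """Shift the panorama so that the given column lands at the
    horizontal center (column index width // 2)."""
    width = pano.shape[1]
    return np.roll(pano, width // 2 - col, axis=1)
```

If the ground truth and the predictions are not recentered in the same way, per-pixel metrics on the rendered balls will compare lighting from different directions.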

For a fair comparison with other methods, you may want to keep the same settings for all of them, since I found that many hyperparameters affect the final results.