Wanggcong / StyleLight

[ECCV 2022] StyleLight: HDR Panorama Generation for Lighting Estimation and Editing
https://style-light.github.io/
MIT License

Evaluating the pretrained model #2

Closed · tonetechnician closed this issue 1 year ago

tonetechnician commented 1 year ago

Hey there @Wanggcong

Thanks so much for making this repo available. It has been very interesting to explore!

I've been evaluating your pretrained model on some of the Laval dataset, and I notice that in the paper you get quite close to ground-truth results.

[image]

Whilst evaluating the model and code, I've noticed I'm struggling quite a lot to get close to the ground truth. I've attempted various hyperparameter configurations. All the examples below were generated with 600 steps for the projection stage (generating the latent code) and 350 steps for the inversion (it usually reaches the LPIPS threshold early, around 30-100 steps).

crop2pano is set to a 60-degree FOV, and the input's aspect ratio is maintained but scaled down to fit within 480p.
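For reference, here is roughly what my inversion stage looks like. This is a simplified sketch, not the repo's exact code: the generator interface (`G.synthesis`), the threshold value, and the learning rate are illustrative.

```python
import torch
import lpips  # perceptual metric used for the early-stop threshold

def invert(G, target, w_init, steps=350, threshold=0.05, lr=0.01):
    """Optimize a latent until the LPIPS distance to the target drops
    below a threshold (in my runs this triggers around steps 30-100).

    G: pretrained generator (illustrative StyleGAN2-style interface).
    target: the FOV crop lifted into panorama space, scaled to [-1, 1].
    w_init: latent code from the 600-step projection stage.
    """
    percept = lpips.LPIPS(net='vgg')
    w = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        pred = G.synthesis(w)                # render a panorama from the latent
        loss = percept(pred, target).mean()  # perceptual distance to the target
        opt.zero_grad()
        loss.backward()
        opt.step()
        if loss.item() < threshold:          # early stop at the LPIPS threshold
            break
    return w.detach()
```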

Here are some attached examples:

Input:

1. [image: 9C4A2495-40e081c571]

2. [image: 9C4A1998-f2e3c43f1e]

3. [image: 9C4A2509-ad1651abdf]

4. [image: test]

Ground truth:

1. [image]

2. [image]

3. [image]

4. [image: 9C4A0205-437172d8ee]

Generated:

1. [image]

2. [image]

3. [image]

4. [image]

I was wondering if you had any thoughts on what might be going wrong, or what could be modified to make the output more consistent with the ground truth?

Many thanks in advance

Wanggcong commented 1 year ago

Hi, thanks for your interest. The goal of StyleLight is to generate plausible lighting given a limited-FOV image; it is hard to recover an indoor scene that matches the ground truth at the pixel level.

tonetechnician commented 1 year ago

Thanks @Wanggcong for your response!

Yeah, I totally understand. I'm just wondering why your examples get closer to the ground truth than mine. Do you think a parameter is different? I'm particularly looking at example 4, as I believe it's the same input as the example in your paper.

Wanggcong commented 1 year ago

Two post-processing steps are applied after HDR generation. First, I remove the black strips at the bottom of the panoramas. Second, I warp the panoramas following Gardner et al., "Learning to Predict Indoor Illumination from a Single Image", SIGGRAPH Asia 2017.

For the second point, the assumption is that the object may not be placed at the center. Note that in a panorama, the camera sits at the center of a sphere; if we want to render an object away from the center, we need to warp the panorama to that new center. I would be happy to share this code if this is the problem you are interested in.
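In the meantime, here is a minimal NumPy sketch of the warp. It is not our exact code: it assumes the scene lies on a unit sphere around the original camera (the approximation used by Gardner et al.) and uses nearest-neighbor sampling; the function name and conventions are only illustrative.

```python
import numpy as np

def warp_panorama(pano, offset):
    """Warp an equirectangular panorama to a new virtual camera center.

    pano: (H, W, C) image. offset: new camera position inside the unit
    sphere centered on the original camera, with |offset| < 1.
    """
    h, w = pano.shape[:2]
    t = np.asarray(offset, dtype=np.float64)

    # Outgoing ray direction for every output pixel.
    theta = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi    # longitude
    phi = (np.arange(h) + 0.5) / h * np.pi - np.pi / 2      # latitude
    theta, phi = np.meshgrid(theta, phi)
    d = np.stack([np.cos(phi) * np.sin(theta),
                  np.sin(phi),
                  np.cos(phi) * np.cos(theta)], axis=-1)    # (H, W, 3)

    # Intersect each ray t + s*d with the unit sphere: |t + s*d|^2 = 1.
    td = d @ t
    s = -td + np.sqrt(td ** 2 - (t @ t) + 1.0)
    p = t + s[..., None] * d                                # points on the sphere

    # Map intersection points back to panorama coordinates and sample.
    src_theta = np.arctan2(p[..., 0], p[..., 2])
    src_phi = np.arcsin(np.clip(p[..., 1], -1.0, 1.0))
    u = ((src_theta + np.pi) / (2 * np.pi) * w).astype(int) % w
    v = np.clip(((src_phi + np.pi / 2) / np.pi * h).astype(int), 0, h - 1)
    return pano[v, u]
```

For example, warp_panorama(pano, [0.0, 0.0, 0.4]) re-centers the panorama at a point 0.4 units in front of the original camera.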

From what I can see, your example 4 shows the warping problem.

I am glad to help you if you have any questions.

Wanggcong commented 1 year ago

Hi, if you have no other questions, I will close this issue. If you do, you can re-open this issue or open a new one. Thank you.