zju3dv / ENeRF

SIGGRAPH Asia 2022: Code for "Efficient Neural Radiance Fields for Interactive Free-viewpoint Video"
https://zju3dv.github.io/enerf

Question about quantitative evaluation on pretrained model #24

Closed Tiansong97 closed 1 year ago

Tiansong97 commented 1 year ago

I used the provided generalization model to run the evaluation on the DTU dataset as described in the README, and got the following PSNR, SSIM, and LPIPS values: [image]

whereas the README reports the following quantitative results: [image]

Why are the quantitative evaluation results different? Could you also share the results you get when you run the evaluation yourself? Thanks.
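For anyone comparing numbers across setups, it can help to pin down the metric itself. Below is a minimal sketch of the standard PSNR definition in plain Python. It is not taken from the ENeRF evaluation code; the function name `psnr` and the assumption that pixel values lie in [0, 1] are mine.

```python
import math


def psnr(pred, target, max_value=1.0):
    """PSNR in dB between two equally sized flat sequences of pixel values.

    Assumes values are in [0, max_value]; this is the textbook definition,
    not necessarily what the repository's evaluation script computes.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical pixel values, for illustration only.
rendered = [0.5, 0.5, 0.5, 0.5]
ground_truth = [0.6, 0.4, 0.5, 0.5]
print("PSNR (dB):", psnr(rendered, ground_truth))
```

Details such as the value range, masking, and whether the metric is averaged per image or per scene can shift the reported number, so they are worth checking when two runs disagree.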

haotongl commented 1 year ago

Hi, this result is very strange. Can you provide more details, for example the PyTorch version and the operating environment?

Besides that, I also wonder whether the pre-trained model was loaded correctly. Can you make sure that this line of code runs successfully? https://github.com/zju3dv/ENeRF/blob/38c1b9087833926de897847636016b73f889d22b/lib/utils/net_utils.py#L443

Tiansong97 commented 1 year ago

Thanks for your reply. Yes, the pretrained model is loaded successfully. To check this, we set "strict=True" and got the same results.
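For context, PyTorch's `load_state_dict` with `strict=True` raises an error when the checkpoint keys and the model keys do not match, while `strict=False` skips mismatched keys. The same check can be sketched without PyTorch as a comparison of key sets. The helper name `diff_state_dict_keys` and the parameter names below are hypothetical and not taken from the ENeRF code.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring what a strict load checks."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(model_keys - checkpoint_keys)
    # Keys the checkpoint provides but the model does not know.
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical parameter names, for illustration only.
model_keys = ["feature_net.conv0.weight", "feature_net.conv0.bias", "nerf.fc.weight"]
checkpoint_keys = ["feature_net.conv0.weight", "feature_net.conv0.bias", "nerf.fc.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real model, the two inputs would typically be `model.state_dict().keys()` and the keys of the loaded checkpoint dictionary. Both lists being empty indicates that every parameter in the model received a value from the checkpoint.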

We ran the code on an Ubuntu server with an NVIDIA 3090 GPU. The PyTorch version is the same as in the README, and the other packages that may have an influence are shown below: [image] [image] [image]

All the evaluation outputs on the DTU dataset are shown below: [image]

The rendered images seem reasonable: Scan114_32_0.png [image], scan45_44_0.png [image]

What's more, the evaluation results on nerf_llff_data (32 images for evaluation in total) and nerf_synthetic_data (32 images for evaluation in total) also differ from the PSNR, SSIM, and LPIPS results reported in the paper: [image] [image] [image]

Again, the rendered images seem reasonable: chair_32_0.png (same as the image shown in the supplementary material) [image], fortress_25_0.png [image]

We are writing a paper and plan to cite your paper and compare against your results, so we would like to get to the bottom of this problem. Thanks.

haotongl commented 1 year ago

The model appears to perform well on the other two datasets, so the discrepancy on DTU may come from an issue with the format of the DTU dataset. Based on the provided rendered images, which are clearly 512x640 in size, it is possible that the camera pose scale is incorrect. To confirm this, could you please check the content of $workspace/dtu/Cameras/train/00000000_cam.txt? For reference, mine looks like this:

0.970263 0.00747983 0.241939 -191.02
-0.0147429 0.999493 0.0282234 3.28832
-0.241605 -0.030951 0.969881 22.5401
0.0 0.0 0.0 1.0

intrinsic
361.54125 0.0 82.900625
0.0 360.3975 66.383875
0.0 0.0 1.0

425.0 2.5
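
To compare a local camera file against the values quoted above, a small parser can help. This is a minimal sketch that assumes the layout shown here: four extrinsic rows, three intrinsic rows, then one row with two depth values (commonly the minimum depth and the depth interval in MVSNet-style camera files, which is an assumption on my part). The function name `parse_cam_text` is hypothetical and not part of the ENeRF code.

```python
def parse_cam_text(text):
    """Parse a camera file laid out like the one quoted above.

    Assumes numeric rows appear in this order: 4 extrinsic rows,
    3 intrinsic rows, then one row with two depth values.
    Non-numeric lines (such as the "intrinsic" label) are skipped.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # blank line
        try:
            rows.append([float(t) for t in tokens])
        except ValueError:
            continue  # label line, not numbers
    if len(rows) < 8:
        raise ValueError("expected at least 8 numeric rows, got %d" % len(rows))
    extrinsic = rows[0:4]   # 4x4 camera pose matrix
    intrinsic = rows[4:7]   # 3x3 intrinsic matrix
    depth_values = rows[7]  # last row, two depth-related values
    return extrinsic, intrinsic, depth_values


# Values copied from the file content quoted in this comment.
CAM_TEXT = """
0.970263 0.00747983 0.241939 -191.02
-0.0147429 0.999493 0.0282234 3.28832
-0.241605 -0.030951 0.969881 22.5401
0.0 0.0 0.0 1.0

intrinsic
361.54125 0.0 82.900625
0.0 360.3975 66.383875
0.0 0.0 1.0

425.0 2.5
"""

extrinsic, intrinsic, depth_values = parse_cam_text(CAM_TEXT)
translation = [row[3] for row in extrinsic[:3]]
print("translation:", translation)
print("focal lengths:", intrinsic[0][0], intrinsic[1][1])
print("principal point:", intrinsic[0][2], intrinsic[1][2])
print("depth values:", depth_values)
```

If the translation or the depth values in a local file differ from these by a constant factor, that would point to a scale mismatch in the camera poses.
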

Tiansong97 commented 1 year ago

We followed the README to download the DTU, LLFF, and NeRF datasets. The camera parameters are shown below. They seem to be identical to yours. [image]

We downloaded the code, the pretrained model, and the datasets, then ran the rendering command directly after specifying the dataset path. We didn't modify the datasets or the code.

haotongl commented 1 year ago

I'm sorry, it's my fault. I must have introduced a bug in a later update. However, I need to go to bed now and don't have time to locate the bug. A quick, temporary workaround is to run "git checkout 2d6b3b2" and then execute the evaluation command.

I will address the issue tomorrow and push the fix to the master branch.

haotongl commented 1 year ago

I have solved this bug.