Reproducing Ship Object Results from Tetra-NeRF Paper

zheruiqiu commented 10 months ago

Dear Jonas Kulhanek,

I am currently working on replicating the results presented in your Tetra-NeRF paper, specifically those related to the Blender dataset's ship object. In the original paper, Tables 2 and 6 report an SSIM (Structural Similarity Index Measure) score of 0.994 for this object. However, despite closely following the instructions outlined in the README on your GitHub page, I have been unable to achieve an SSIM score higher than 0.9.

I am reaching out to seek guidance or insights that might help me align my results with those reported in the paper. I would greatly appreciate any advice or recommendations you could provide that might help accurately replicate the reported results. Thank you for your time and for the impressive work on Tetra-NeRF.

Looking forward to your response.

[Attached images: ssim_skimage, ssim_nerfstudio]

jkulhanek commented 10 months ago

Hi,

sorry you are having issues. Can you please share other metrics and images? I will look into it.

zheruiqiu commented 10 months ago

Dear Jonas,

Thank you again for your willingness to assist. As per your request, I have compiled a detailed set of results to provide a clearer picture of the issues I'm encountering:

Evaluation Loss: Attached are the graphs showing the evaluation loss over epochs.
SSIM and PSNR Metrics: I have included a summary of the SSIM and PSNR metrics obtained.
Depth Images of Ship Object: I am also attaching a series of depth images of the ship object.

Looking forward to your guidance.

jkulhanek commented 10 months ago

Ok, PSNR for that scene seems to be consistent. What implementation of SSIM are you using?

jkulhanek commented 10 months ago

For the blender dataset, we took the implementation from pointnerf to be able to compare with them. https://github.com/jkulhanek/tetra-nerf/blob/578a34bc0a4f00a43c9e4a3598396ef8db442cc4/tetranerf/nerfstudio/model.py#L61 Are you plotting the same metric?

jkulhanek commented 10 months ago

I see; you report all of them. It is not consistent.

jkulhanek commented 10 months ago

Can you please run the full evaluation and report the three SSIM scores for the test split?

jkulhanek commented 10 months ago

These were my results on the val set with nerfstudio's SSIM implementation. These are consistent with yours. The difference is in skimage's implementation of SSIM.

jkulhanek commented 10 months ago

Ok, I think I see where the problem is. If you compare the pointnerf implementation: https://github.com/Xharlie/pointnerf/blob/ff10fac8081a852c2942347bc8e5cd6183159ebe/run/evaluate.py#L10 with tetranerf's, they are not the same. When I tried to upgrade the code to a newer skimage version, I did something wrong. Would it be possible to compare the images using pointnerf's implementation (with the older skimage version)? I think you will get results that are consistent with the paper.

jkulhanek commented 10 months ago

In particular, I think pointnerf uses data_range=2. and channel_axis=None here: https://github.com/jkulhanek/tetra-nerf/blob/578a34bc0a4f00a43c9e4a3598396ef8db442cc4/tetranerf/nerfstudio/model.py#L64 What do you think?
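
A minimal sketch of why the data range setting discussed above can move SSIM so much, assuming the standard SSIM stabilising constants C1 = (0.01 * L)^2 and C2 = (0.03 * L)^2, where L is the data range. The function name global_ssim and the toy pixel values are hypothetical. The sketch computes a single set of statistics over the whole image instead of skimage's sliding window, so its numbers will not match either implementation; it only illustrates that assuming a range of 2 for images stored in [0, 1] pushes the score towards 1.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def global_ssim(x, y, data_range, k1=0.01, k2=0.03):
    """Single-window SSIM over two equal-length lists of pixel values.

    Simplification (an assumption of this sketch): statistics are taken over
    the whole image at once, whereas skimage averages per-window scores.
    """
    c1 = (k1 * data_range) ** 2  # stabilises the luminance term
    c2 = (k2 * data_range) ** 2  # stabilises the contrast/structure term
    mu_x, mu_y = mean(x), mean(y)
    var_x = mean([(a - mu_x) ** 2 for a in x])
    var_y = mean([(b - mu_y) ** 2 for b in y])
    cov_xy = mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical toy "images" with pixel values in [0, 1].
reference = [0.10, 0.40, 0.35, 0.80, 0.55, 0.20, 0.90, 0.65]
rendered = [0.15, 0.30, 0.45, 0.70, 0.60, 0.30, 0.80, 0.60]

score_range_1 = global_ssim(reference, rendered, data_range=1.0)
score_range_2 = global_ssim(reference, rendered, data_range=2.0)

print(f"data_range=1.0 -> {score_range_1:.4f}")
print(f"data_range=2.0 -> {score_range_2:.4f}")
```

Because both constants grow with the square of the data range, the same pair of images scores higher when the range is overstated, which is one reason SSIM values are only comparable when every method is evaluated with identical settings.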

zheruiqiu commented 10 months ago

Dear Jonas,

Thank you for your follow-up. Regarding the SSIM implementation, I have utilized both nerfstudio_ssim and skimage_ssim as per the Tetra-NeRF codebase. The relevant code snippet is as follows:

metrics_dict = {
    "psnr": float(psnr.item()),
    "nerfstudio_ssim": float(self.nerfstudio_ssim(image, rgb)),
    "skimage_ssim": float(self.skimage_ssim(image, rgb)),
    "lpips": float(lpips),
}

This code can be found in the file model.py of the Tetra-NeRF repository, specifically at this link.

Thank you for the detailed explanation and the suggestion to use PointNeRF's SSIM implementation with the older version of skimage. I will follow your guidance and run a full evaluation using this method. I will report back with the SSIM scores as you requested. And it would be immensely helpful if you could also run a parallel evaluation on your end after updating the code. This would provide an additional layer of verification and might offer further insights into the consistency and accuracy of the results.

Additionally, I wanted to share my observations regarding the SSIM scores obtained using the nerfstudio_ssim implementation, which were between 0.885 and 0.890 for the ship object (in line with the results you provided). These scores are far from the 0.994 reported in the paper. While I understand that different implementations can yield varying results, such a significant discrepancy seems unusual.

In light of this, I agree with your suggestion to conduct a comparison using PointNeRF's implementation. However, I still believe it might be beneficial to also review the SSIM scores for other objects in the dataset. This could help ensure the accuracy of reported values across different cases and offer a more comprehensive understanding of the overall situation.

I appreciate your help in navigating these challenges and look forward to sharing my findings after conducting the evaluations.

Finally, regarding the channel_axis parameter (for the structural_similarity function), according to the skimage documentation, setting channel_axis to None would treat the image as a grayscale image, which might not be appropriate for our case as we are dealing with multichannel (RGB) images.

Once again, thank you for your insights and collaboration. As a side note, I wanted to mention that it's currently 3:00 AM in my time zone, so I will be heading to bed shortly. Therefore, there might be a delay in my response to any further messages. I look forward to continuing our discussion after I have had the chance to rest.

jkulhanek / tetra-nerf

Reproducing Ship Object Results from Tetra-NeRF Paper #26