Regarding the issue of abnormal LPIPS and CLIPIQA scores in reproducing natural image inpainting task

stzyh commented 6 months ago

Hi, I'm using the publicly available pre-trained model resshift_inpaint_imagenet_s4.pth with the recommended command:

    python inference_resshift.py -i [image folder/image path] -o [result folder] --mask_path [mask path] --task inpaint_imagenet --scale 1 --chop_size 256 --chop_stride 256 --bs 32

I'm testing on the ImageNet-Test dataset, applying masks to create the low-quality images. However, the LPIPS and CLIPIQA scores I compute with the IQA-Library differ significantly from the expected values: LPIPS: 0.7797829709947109 (expected: 0.2298), CLIPIQA: 0.6490718757808208 (expected: 0.4519).
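One thing that may be worth ruling out before looking at the metric itself is whether each restored image is compared against its own reference. Nothing in the thread confirms this as the cause; the sketch below is only a sanity check. It assumes restored and reference images share file stems, and the function name `pair_by_stem` and the extension list are hypothetical, not part of the ResShift repository.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the dataset at hand.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp"}


def pair_by_stem(restored_dir, reference_dir):
    """Match restored and reference images by file stem (name without extension).

    Returns (pairs, only_restored, only_reference), where pairs is a list of
    (restored_path, reference_path) tuples sorted by stem.
    """
    def index(folder):
        # Map stem -> path for every image file directly inside the folder.
        return {
            p.stem: p
            for p in Path(folder).iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTS
        }

    restored = index(restored_dir)
    reference = index(reference_dir)

    common = sorted(restored.keys() & reference.keys())
    pairs = [(restored[s], reference[s]) for s in common]
    only_restored = sorted(restored.keys() - reference.keys())
    only_reference = sorted(reference.keys() - restored.keys())
    return pairs, only_restored, only_reference
```

If either of the two leftover lists is non-empty, or the number of pairs is smaller than the test set, the averaged score is not being computed over the intended image pairs.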

Could you please advise if I have overlooked anything in my experiments? I would greatly appreciate your help!

zsyOAOA commented 6 months ago

Were you able to run the Colab successfully? @stzyh

stzyh commented 6 months ago

Thank you! I can run the project successfully on my server and generate the corresponding images. Some additional information: when I followed the published code and commands to run training, the validation set LPIPS during training was 0.3680. However, when I use that trained model to generate images and compute LPIPS with the IQA-Library, I get 0.7797853878140449. Is it possible that I used the image generation script incorrectly?

zsyOAOA commented 6 months ago

I guess there are some bugs in your testing script. Please check it carefully!

stzyh commented 6 months ago

Thank you for your help. One more question: when I use the provided testing command, should the input be the LQ image with the mask applied, or the HQ image without a mask? I want to check whether my input images are wrong.

zsyOAOA commented 6 months ago

In my testing script, you should input the LQ image with a mask applied.
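For illustration, "LQ image with a mask applied" can be sketched as below. This is a toy version on nested lists, not the repository's preprocessing: the helper name `apply_mask`, the fill value, and the convention that 1 marks the region to be inpainted are all assumptions, so the mask polarity should be checked against the actual mask files.

```python
def apply_mask(image, mask, fill=0):
    """Return a copy of `image` with masked pixels replaced by `fill`.

    image: H x W nested list of pixel values.
    mask:  H x W nested list; a truthy entry marks a pixel to be inpainted
           (assumed convention, verify against the real masks).
    """
    same_height = len(image) == len(mask)
    same_width = all(len(row) == len(mrow) for row, mrow in zip(image, mask))
    if not (same_height and same_width):
        raise ValueError("image and mask must have the same shape")

    # Keep the original pixel where the mask is 0, blank it where the mask is 1.
    return [
        [fill if m else px for px, m in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


hq = [[10, 20], [30, 40]]
hole = [[0, 1], [1, 0]]
lq = apply_mask(hq, hole)  # [[10, 0], [0, 40]]
```

The masked result is what goes to the inference script, while the unmasked original is kept as the reference for the metrics.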

stzyh commented 6 months ago

Thank you very much for your guidance so far! I have debugged and revised my code for computing the metrics, and the results have improved. Using the LPIPS library to compare the images generated by the published pre-trained model against the reference images, I get LPIPS = 0.24801460076821968 (expected: 0.2298). Using the IQA-Library on the corresponding dataset, I get LPIPS = 0.2677929490900133 (expected: 0.2298). The discrepancy between these two results is confusing, and it is hard to tell which computation is more reliable. If you would be willing to share the scripts you used to calculate the metrics for the natural image restoration tasks, I would greatly appreciate it. I sincerely hope to follow your excellent research and look forward to your response. Here is my email address: [13723444046@163.com].
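To put the remaining gaps in perspective, the relative deviation of each measurement from the expected value can be computed directly from the numbers quoted above. The helper name `relative_gap` is hypothetical; the values are the ones reported in this comment.

```python
def relative_gap(measured, reported):
    """Signed deviation of `measured` from `reported`, as a fraction of `reported`."""
    return (measured - reported) / reported


REPORTED_LPIPS = 0.2298  # expected value quoted in the thread

measurements = {
    "LPIPS library": 0.24801460076821968,
    "IQA-Library": 0.2677929490900133,
}

for name, value in measurements.items():
    gap = relative_gap(value, REPORTED_LPIPS)
    print(f"{name}: {value:.4f} ({gap:+.1%} vs. expected)")
```

The two measurements come out about 7.9% and 16.5% above the expected 0.2298, so both toolkits now land in the same range, and what remains is the difference between the two implementations.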

zsyOAOA commented 6 months ago

For your reference, I have uploaded the script for calculating the inpainting metrics. @stzyh

zsyOAOA / ResShift

Regarding the issue of abnormal LPIPS and CLIPIQA scores in reproducing natural image inpainting task #67