kiven111 opened this issue 1 year ago
https://github.com/xiaogang00/SMG-LLIE/blob/main/datasets/LOL_real.py#L36
https://github.com/xiaogang00/SMG-LLIE/blob/main/configs/transforms_config.py
Several transforms are applied to the GT images here, and they are not inverted after inference.
Thank you for your response. I added a sorting call to the dataloader code to address the issue.
Hi, have you figured out what's going on? It seems that the author modified the GT to get the results in the paper, which may be unfair to other methods.
The resolution of this model's output images is limited to 512x512, so to compute evaluation metrics like PSNR, the ground truth (GT) must be resized to the same resolution. Additionally, a sorting call (e.g. sort()) is missing from the data loading code, which leads to a mismatch between the generated images and the images in the original folder. Therefore, new GT images have to be generated in order to calculate the evaluation metrics.
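For reference, here is a minimal sketch of the sorting fix; `make_dataset` is a hypothetical helper, not the repo's actual function:

```python
# Hypothetical sketch of the dataloader fix (not the repo's code).
# os.listdir returns files in arbitrary order, so without sorted() the
# i-th low-light input is not guaranteed to match the i-th GT image.
import os

def make_dataset(dir_path):
    return [os.path.join(dir_path, f) for f in sorted(os.listdir(dir_path))]
```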
Thanks for the explanation, and I agree that the images should be resized and the sort() function should be included to match corresponding images. However, I notice that in https://github.com/xiaogang00/SMG-LLIE/blob/main/configs/transforms_config.py, the image is processed by transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]). I am wondering whether it is normal and reasonable to process the GT images with this code. I think it would normalize the brightness of the GT and predicted images to a relatively fixed value, which would considerably improve the final performance, since low-light image enhancement partly aims to correct the brightness of low-light images to match that of the GT.
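For context, here is a minimal sketch (my own, not from the repo) of what that transform does and the inverse that would have to be applied to the network output before comparing against the untouched GT:

```python
# Normalize([0.5]*3, [0.5]*3) is the fixed linear map x -> (x - 0.5) / 0.5,
# i.e. it sends [0, 1] to [-1, 1]. A fair evaluation applies the inverse
# (x + 1) / 2 to the model output and compares against the original GT,
# rather than against a GT that was re-saved after this transform.
import torch
from torchvision import transforms

normalize = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

def denormalize(t: torch.Tensor) -> torch.Tensor:
    # Inverse of the transform above: map [-1, 1] back to [0, 1].
    return (t * 0.5 + 0.5).clamp(0.0, 1.0)

x = torch.rand(3, 512, 512)                       # dummy image in [0, 1]
assert torch.allclose(denormalize(normalize(x)), x, atol=1e-6)
```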
Your observation is quite astute; I didn't notice this during testing. If possible, I look forward to the author's response to this question.
Hi, guys. Any suggestions on how I should report the results of SMG-LLIE in a new paper? Should I use the numbers reported in this paper, or re-evaluate the results myself?
After running the test_LOL_real.sh file, three folders are generated under the results_image_generation_LOL_real directory:
- gt
- input
- output
Then, computing PSNR and SSIM on the generated output and gt folders, I obtain results consistent with those reported in the paper:
===> Avg.PSNR: 24.6227 dB
===> Avg.SSIM: 0.8219
However, when I replace the gt generated by your code with the ground truth provided by the official LOLv2 dataset and recalculate PSNR and SSIM, the metrics drop significantly:
===> Avg.PSNR: 16.0604 dB
===> Avg.SSIM: 0.4606
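For anyone who wants to reproduce this comparison, here is a minimal sketch of the evaluation loop (the paths are placeholders); pointing GT_DIR at the code-generated gt folder versus the official LOLv2 GT yields the two sets of numbers above:

```python
# Sketch: average PSNR/SSIM between the generated outputs and a GT folder,
# with the GT resized to the output resolution so the comparison is defined.
import os
import numpy as np
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

OUT_DIR = "results_image_generation_LOL_real/output"
GT_DIR = "results_image_generation_LOL_real/gt"   # or the official LOLv2 GT

psnrs, ssims = [], []
for out_name, gt_name in zip(sorted(os.listdir(OUT_DIR)), sorted(os.listdir(GT_DIR))):
    out = np.array(Image.open(os.path.join(OUT_DIR, out_name)).convert("RGB"))
    gt_img = Image.open(os.path.join(GT_DIR, gt_name)).convert("RGB")
    gt = np.array(gt_img.resize((out.shape[1], out.shape[0]), Image.BICUBIC))
    psnrs.append(peak_signal_noise_ratio(gt, out, data_range=255))
    ssims.append(structural_similarity(gt, out, channel_axis=-1, data_range=255))

print(f"===> Avg.PSNR: {np.mean(psnrs):.4f} dB")
print(f"===> Avg.SSIM: {np.mean(ssims):.4f}")
```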
Are the metrics evaluated in your paper objective and reasonable? Could you explain the inconsistency between the ground truth used when computing the metrics in the code and the officially provided one? The author needs to provide a reasonable interpretation of the results in the paper.