Xiaofeng-life / SFSNiD


The eval results of the "train_SFSNiD_supervised.py" are strange. #6

Open jm-xiong opened 2 months ago

jm-xiong commented 2 months ago

I trained train_SFSNiD_supervised.py on the UNREAL-NH dataset using the parameters provided by the authors. When I evaluate at the whole image size, the PSNR is about 16.25, but when I evaluate at 256×256 it reaches 25.07. Why are the validation images resized to 256×256 when calculating the metrics? I would appreciate it if you could help me out.
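
To make the comparison concrete, this is roughly how I compute the two numbers (a minimal sketch with hypothetical file names, assuming Pillow, NumPy and scikit-image; it is not the repository's own evaluation script):

```python
# A minimal sketch of the two evaluation settings (hypothetical file names;
# assumes Pillow, NumPy and scikit-image are installed).
import numpy as np
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio


def load(path, size=None):
    img = Image.open(path).convert("RGB")
    if size is not None:
        img = img.resize(size)  # PIL expects (width, height)
    return np.asarray(img)


# whole-size evaluation: compare at the original resolution
psnr_full = peak_signal_noise_ratio(load("gt.png"), load("restored_full.png"))

# 256x256 evaluation: resize both images before computing PSNR
psnr_256 = peak_signal_noise_ratio(
    load("gt.png", (256, 256)), load("restored_256.png", (256, 256))
)

print(f"whole size: {psnr_full:.2f} dB, 256x256: {psnr_256:.2f} dB")
```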

Xiaofeng-life commented 2 months ago

> I trained train_SFSNiD_supervised.py on the UNREAL-NH dataset using the parameters provided by the authors. When I evaluate at the whole image size, the PSNR is about 16.25, but when I evaluate at 256×256 it reaches 25.07. Why are the validation images resized to 256×256 when calculating the metrics? I would appreciate it if you could help me out.

As far as I know, the UNREAL-NH training images are 480×480, which is not dramatically different from 256. Perhaps you can train from scratch with the training and test images at the same size. In my paper, I resized all images to the same size for a fair comparison.

jm-xiong commented 2 months ago

Thank you for your reply. As far as I know, it is common practice to train a model on resized training images and then use its checkpoint to test whole-size images. Why is the PSNR so different when I only change the size of the test images? As you suggest, I could train from scratch with the training and test images at the same size, but that is inflexible when training on multiple datasets with different input sizes. Besides, for a fair comparison, both the training and test sets in SFSNiD are resized to 256. Are the other comparison methods treated in the same way? I would appreciate it if you could help me out.

Xiaofeng-life commented 2 months ago

> Thank you for your reply. As far as I know, it is common practice to train a model on resized training images and then use its checkpoint to test whole-size images. Why is the PSNR so different when I only change the size of the test images? As you suggest, I could train from scratch with the training and test images at the same size, but that is inflexible when training on multiple datasets with different input sizes. Besides, for a fair comparison, both the training and test sets in SFSNiD are resized to 256. Are the other comparison methods treated in the same way? I would appreciate it if you could help me out.

(1) "And are the other comparison methods treated in the same way?" The reason is that some algorithms cannot accept real-world images of arbitrary sizes as input. In our paper, real-world evaluation is an important part. So we set all algorithms to have the same output size. (2) "When I evaluate whole size, the psnr tends to 16.25. But I evaluate the size of 256256, the results can be 25.07?" This is a big change. It doesn't count as a "slight change". I suggest you turn on data augmentation (like random crops). Alternatively, you could check your evaluation code. Changing from 256 to 480 should not reduce the PSNR to 16.

jm-xiong commented 2 months ago

> Thank you for your reply. As far as I know, it is common practice to train a model on resized training images and then use its checkpoint to test whole-size images. Why is the PSNR so different when I only change the size of the test images? As you suggest, I could train from scratch with the training and test images at the same size, but that is inflexible when training on multiple datasets with different input sizes. Besides, for a fair comparison, both the training and test sets in SFSNiD are resized to 256. Are the other comparison methods treated in the same way? I would appreciate it if you could help me out.

(1) "And are the other comparison methods treated in the same way?" The reason is that some algorithms cannot accept real-world images of arbitrary sizes as input. In our paper, real-world evaluation is an important part. So we set all algorithms to have the same output size. (2) "When I evaluate whole size, the psnr tends to 16.25. But I evaluate the size of 256256, the results can be 25.07?" This is a big change. It doesn't count as a "slight change". I suggest you turn on data augmentation (like random crops). Alternatively, you could check your evaluation code. Changing from 256 to 480 should not reduce the PSNR to 16.

Thank you for your reply. For (1), resizing to a fixed size for real-world evaluation is reasonable, but I'm confused about the synthetic data, such as GTA5 and UNREAL-NH. It seems that only the supervised branch is used to train on synthetic data. For synthetic data, are the results in Table 2 evaluated at 256×256 for all comparison algorithms? For (2), without modifying any other code, I train train_SFSNiD_supervised.py on UNREAL-NH and then use inference_real_world.py to evaluate at the whole size, only commenting out the line `img = img.resize(img_size)`. The PSNR is about 16.25, but at 256×256 the result is 25.07. Is there something wrong here? I would appreciate it if you could help me out.
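
To be precise, the two inference paths I am comparing look roughly like this (a minimal sketch with a hypothetical `model` and file path; it is not the actual inference_real_world.py):

```python
# A minimal sketch of the two inference paths being compared (hypothetical
# `model` and file path; not the repo's inference_real_world.py itself).
import torch
import torchvision.transforms.functional as TF
from PIL import Image


def dehaze(model, path, img_size=None):
    img = Image.open(path).convert("RGB")
    if img_size is not None:
        img = img.resize(img_size)      # the line I comment out for whole-size eval
    x = TF.to_tensor(img).unsqueeze(0)  # (1, 3, H, W)
    with torch.no_grad():
        out = model(x)
    return out.clamp(0, 1)


# out_256  = dehaze(model, "hazy.png", img_size=(256, 256))  # 256x256 path
# out_full = dehaze(model, "hazy.png")                       # whole-size path
```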

Xiaofeng-life commented 1 week ago

Hi, sorry for taking so long to reply to you. As far as I know, there should not be such a large difference. But you can try other dehazing networks, such as DehazeFormer; you can directly replace my network structure. UNREAL-NH itself cannot be directly used for nighttime dehazing; you can refer to the original results of UNREAL-NH. There are relatively few studies in this field (nighttime dehazing). I have noticed that some datasets cannot be trained at the original size because the pixels are not aligned.
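
If you try DehazeFormer, the swap could look roughly like this (a minimal sketch; the import paths are hypothetical and depend on how you vendor the DehazeFormer code, and it only assumes both networks are image-to-image modules mapping (B, 3, H, W) to (B, 3, H, W)):

```python
# A minimal sketch of swapping the restoration backbone. The import paths are
# hypothetical; it only assumes both networks are image-to-image nn.Modules
# mapping (B, 3, H, W) -> (B, 3, H, W), so the rest of the training loop,
# losses and optimizer can stay unchanged.
import torch.nn as nn


def build_network(name: str) -> nn.Module:
    if name == "sfsnid":
        from models import SFSNiD_net             # hypothetical: original network
        return SFSNiD_net()
    if name == "dehazeformer":
        from dehazeformer import dehazeformer_b   # hypothetical: vendored DehazeFormer
        return dehazeformer_b()
    raise ValueError(f"unknown network: {name}")


# model = build_network("dehazeformer")
```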