Closed — minshu-kim closed this issue 2 years ago
In our paper, we judged the failure cases subjectively. Actually, I think we could give a quantitative standard by setting a threshold on PSNR or SSIM. But I'm not sure whether such a standard would apply to all cases and be acknowledged by other researchers.
I agree that it is heuristic too. In UDHN (IEEE RA-L 2018), a failure case is defined as an image whose PSNR over the overlapping region is lower than that of the image warped by I_3x3 (the identity matrix). What do you think about this measure for counting failures? Excuse me again.
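The UDHN-style rule above could be sketched roughly as follows. This is a minimal illustration, not UDHN's actual code; the function names and the assumption that the overlapping regions have already been cropped to the same shape are mine:

```python
import numpy as np

def psnr(a: np.ndarray, b: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two images of equal shape."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def is_failure(target_overlap: np.ndarray,
               warped_overlap: np.ndarray,
               identity_overlap: np.ndarray) -> bool:
    """UDHN-style rule (as described above): the result counts as a failure
    when the estimated warp's overlap PSNR is no better than simply
    applying the 3x3 identity matrix."""
    return bool(psnr(target_overlap, warped_overlap)
                <= psnr(target_overlap, identity_overlap))
```

Here `target_overlap` is the overlapping region of the reference image, and the other two arguments are the same region produced by the estimated warp and by the identity warp, respectively.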
But when the overlap rate of a case is very low, the PSNR of the I_3x3 baseline is already so low that an evaluated method can hardly score below it, even when the method has obviously failed on that case.
So, in my view, how about drawing a curve to represent robustness? In this figure, the x axis would denote the threshold values from low to high, and the y axis the success rate.
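Computing such a robustness curve is straightforward; a minimal sketch (the function name and the "success = PSNR above threshold" convention are my assumptions) might look like:

```python
import numpy as np

def success_rate_curve(psnrs, thresholds):
    """For each threshold t, return the fraction of test cases whose PSNR
    exceeds t. Plotting success rate against the threshold summarizes
    robustness without committing to a single failure cutoff."""
    psnrs = np.asarray(psnrs, dtype=np.float64)
    return np.array([np.mean(psnrs > t) for t in thresholds])

# e.g. four test cases evaluated at four candidate thresholds:
rates = success_rate_curve([12, 18, 25, 30], [10, 15, 20, 25])
# → [1.0, 0.75, 0.5, 0.25]
```

A method whose curve stays high as the threshold increases is more robust than one whose curve drops early, which sidesteps the single-threshold problem mentioned above.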
That sounds reasonable. Thank you for your novel idea and help 👍
In Table 2 of your paper, you report the number of failure result images. Is there any quantitative standard for determining the failure cases?