XiangZ-0 / EVDI

Implementation of CVPR'22 paper "Unifying Motion Deblurring and Frame Interpolation with Events"

Normalization Method #8

Closed chenkang455 closed 1 year ago

chenkang455 commented 1 year ago

Hi @XiangZ-0 , thanks for your marvelous work! I am a bit confused by the normalization method used to calculate the loss function in this part of the paper. My understanding is that we should normalize the same pixel across different times, but in your code the entire image is normalized at a single time:

```python
for i in range(mid_events.shape[0]):
    if mid_events[i, ...].max() == mid_events[i, ...].min():  # no events: compare the two sharp images directly
        norm_prev_imgs = util.normalize(log_prev_imgs[i, ...], max_val=max_value)
        norm_next_imgs = util.normalize(log_next_imgs[i, ...], max_val=max_value)
        L_S_E += loss(norm_prev_imgs, norm_next_imgs)
    else:
        mask = mid_events[i, ...] != 0  # only penalize pixels that received events
        diff_imgs = log_next_imgs[i, ...] - log_prev_imgs[i, ...]
        norm_diff_imgs = util.normalize(diff_imgs, max_val=max_value)
        norm_mid_event = util.normalize(mid_events[i, ...], max_val=max_value)
        L_S_E += loss(norm_diff_imgs * mask, norm_mid_event * mask)
L_S_E /= (N - 1)
```

Could you point out the reason? Thanks a lot!
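For context, I assume `util.normalize` does something like a min-max rescale over the whole tensor; this is my own sketch (in NumPy) of that assumption, not the repo's actual code:

```python
import numpy as np

def normalize(x, max_val=1.0, eps=1e-8):
    # Hypothetical sketch of util.normalize: linearly rescale the whole
    # tensor to [0, max_val] using its global min and max.
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min + eps) * max_val
```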

XiangZ-0 commented 1 year ago

Hello chenkang455,

Thank you for your interest in our work. In the sharp-event loss, we apply the normalization to the entire image since we assume that the event threshold $c$ is constant in both spatial and temporal domains within the time interval [f, t]. Replacing the global normalization with per-pixel normalization might be a potential boost to the algorithm as the latter does not require the threshold $c$ to be spatially constant, and you can experiment with this point if you are interested :)
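To make the distinction concrete, here is a hedged NumPy sketch (not the repo's code) contrasting global normalization with the per-pixel temporal variant discussed above, assuming a stack of log-intensity differences with time as the leading axis:

```python
import numpy as np

def global_normalize(x, eps=1e-8):
    # One min/max for the entire image: consistent with a threshold c
    # that is constant in both the spatial and temporal domains.
    return (x - x.min()) / (x.max() - x.min() + eps)

def per_pixel_normalize(stack, eps=1e-8):
    # Normalize each pixel independently along the time axis (axis 0),
    # which would tolerate a spatially varying threshold c.
    mn = stack.min(axis=0, keepdims=True)
    mx = stack.max(axis=0, keepdims=True)
    return (stack - mn) / (mx - mn + eps)
```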

chenkang455 commented 1 year ago

Thank you for your detailed response. I would like to ask another question, about the role of the S-E (sharp-event) loss. In your paper you mention that its role is to address the motion blur problem, but what is the significance of addressing it for the performance metrics? In the ideal scenario where the S-E loss of the fully optimized network is zero, the relationship between the different sharp images is exactly the one given by the EDI model; yet that relationship is strongly affected by event noise, which can degrade the model's performance metrics. Since the performance improvement after adding the S-E loss was not very significant in your paper, what is the significance of adding it to the loss function?

My understanding is that adding the S-E loss lets the model learn a more reasonable relationship between events and images, which to some extent improves generalization and prevents overfitting. However, as training progresses the S-E loss keeps falling, which amplifies the effect of event noise. It therefore seems very important to set the weighting between the losses, so that the B-S loss can decrease while the S-E loss does not fall too low. Thank you again for your detailed response!
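The weighting trade-off described above could be expressed as a simple combined objective; the weight values below are purely illustrative, not the ones used in the paper:

```python
def total_loss(loss_bs, loss_se, w_bs=1.0, w_se=0.1):
    # Hypothetical weighting: keep the sharp-event term small enough that
    # it regularizes the event-image relationship without letting event
    # noise dominate the optimization.
    return w_bs * loss_bs + w_se * loss_se
```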

XiangZ-0 commented 1 year ago

Thank you for the question. As discussed in the ablation section of our paper, the sharp-event and blurry-event losses are both designed to address motion ambiguity, but from different perspectives. The role of the sharp-event loss can be validated by comparing the models in the first and fifth rows of Tab. 3 of our paper, where adding the sharp-event loss contributes a 5-6 dB PSNR improvement. Although the performance gain of the sharp-event loss is relatively small compared with that of the blurry-event loss, due to the constant-threshold assumption and event noise, we still include it in our paper since it may inspire future work to further explore and improve the constraints between events and frames.