XiangZ-0 / EVDI

Implementation of CVPR'22 paper "Unifying Motion Deblurring and Frame Interpolation with Events"

Some questions about experiments #1

Open · Pei233 opened this issue 2 years ago

Pei233 commented 2 years ago

Hi, authors.

Thank you for the excellent work on event-based deblurring. I have some questions about the comparison experiments.

  1. I noticed some differences in the quantitative results reported for existing methods, such as LEDVDI, eSL-Net, and RED. How did you obtain these results? Did you retrain or reimplement the compared methods on your datasets?
  2. I saw the sequence prediction comparison in Table 1 of 'Motion Deblurring with Real Events', but I am not sure whether it has the same meaning as Table 2 in your paper.
XiangZ-0 commented 2 years ago

Hello Pei233,

Thanks for your interest in our work!

  1. For the learning-based methods, we use the official code and models provided by the authors for comparison. The differences may come from details of the blur and event synthesis, such as the blur degree, image resolution, and frame-rate conversion. In our experiments, all comparisons are conducted on the same dataset under the same configuration.
  2. They are not the same. The sequence prediction comparison in 'Motion Deblurring with Real Events' is similar to the setting of our deblurring experiment shown in Table 1. Table 2 in our paper shows the quantitative results of the frame interpolation task, with the experimental details presented in Section 5.3.

Hope it helps with your research :-)

Best, Xiang

Pei233 commented 2 years ago

Thank you for the reply.

For Q1: so you evaluated the existing pre-trained models on your datasets. For Q2, a follow-up question: what is the difference between the single frame prediction comparison and the sequence prediction comparison? And does Table 1 in your paper belong to single or sequence prediction? According to the details in Section 5.2, it seems to belong to sequence prediction.

BTW, would it be convenient for you to share the code for generating the experimental data, e.g., for producing the .npz files from the HQF dataset (.bag format)?

Thanks.

XiangZ-0 commented 2 years ago

Yes, the results in Table 1 of our paper are obtained under sequence prediction. In my understanding, single frame prediction means that only one latent frame per blurry input (e.g., the latent frame at the middle of the exposure time of the corresponding blurry frame) is used to evaluate performance, while sequence prediction means that a whole sequence of restored latent frames is used for evaluation. The latter is more challenging since it requires the deblurring method to handle temporal ambiguity.

For the data generation, we may release the related code in the future. In the meantime, you can follow the instructions in our paper (Section 5) to construct the dataset.
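To make the two evaluation protocols concrete, here is a minimal sketch of the difference; this is illustrative only, not the paper's evaluation code, and the frame lists and PSNR helper are hypothetical:

```python
import numpy as np

def psnr(pred, gt, max_val=1.0):
    # Peak signal-to-noise ratio between two images scaled to [0, max_val].
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def single_frame_score(restored, gt):
    # Single frame prediction: score only the latent frame at the middle
    # of the exposure time of the blurry input.
    mid = len(restored) // 2
    return psnr(restored[mid], gt[mid])

def sequence_score(restored, gt):
    # Sequence prediction: score every restored latent frame and average.
    # Harder, since the whole sequence must resolve temporal ambiguity
    # (e.g., frames restored in the wrong temporal order score poorly).
    return float(np.mean([psnr(r, g) for r, g in zip(restored, gt)]))
```

And for the .bag question above, extracting events from a rosbag into a NumPy archive could look roughly like the sketch below. This assumes the ROS1 `rosbag` Python API and the `dvs_msgs/EventArray` message type from rpg_dvs_ros; the topic name '/dvs/events' and the output field layout are assumptions, not the authors' pipeline:

```python
import numpy as np
import rosbag  # ROS1 Python API

xs, ys, ts, ps = [], [], [], []
with rosbag.Bag('sequence.bag') as bag:
    # '/dvs/events' is a common topic name for dvs_msgs/EventArray;
    # run `rosbag info sequence.bag` to find the actual one.
    for _, msg, _ in bag.read_messages(topics=['/dvs/events']):
        for e in msg.events:
            xs.append(e.x)
            ys.append(e.y)
            ts.append(e.ts.to_sec())
            ps.append(1 if e.polarity else 0)

np.savez('sequence_events.npz',
         x=np.asarray(xs, dtype=np.int16),
         y=np.asarray(ys, dtype=np.int16),
         t=np.asarray(ts, dtype=np.float64),
         p=np.asarray(ps, dtype=np.int8))
```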

Pei233 commented 2 years ago

Hello, Xiang. When I try to create a new dataset like yours to train the network, I'm confused about the timestamps of the blurry inputs, such as 'exp_start1', 'exp_end1', 'exp_start2', and 'exp_end2'. How do we obtain these values when generating the blurry images? How do we build the temporal relationship between the blurry images and the event data? And are the events of a whole sequence (for example, one sequence of the GoPro dataset) stored in one .npy file together with the blurry images, or does one .npy file (line 78 in Dataset.py) contain the information of only two consecutive images?

XiangZ-0 commented 2 years ago

Hi Pei233,

'exp_start' and 'exp_end' represent the exposure start and end times of the corresponding blurry frame, respectively. For synthetic datasets, we synthesize blurry frames by averaging a sequence of sharp frames, and 'exp_start' and 'exp_end' are set to the timestamps of the first and last frames of that sharp sequence. For example, if you synthesize a blurry frame from 5 sharp frames with timestamps [0.1, 0.2, 0.3, 0.4, 0.5], then 'exp_start' = 0.1 and 'exp_end' = 0.5 for that blurry frame.

In our implementation, one .npy file contains the information of only two consecutive blurry images, and its event data spans from the exposure start time of the left blurry frame to the exposure end time of the right blurry frame, i.e., from 'exp_start1' to 'exp_end2'.
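As an illustration of this packaging, here is a minimal, self-contained sketch; the array shapes, dictionary keys, and file name are hypothetical placeholders, not the repository's exact format:

```python
import numpy as np

# Dummy stand-ins for a real sharp sequence and its event stream.
frames = [np.random.rand(180, 240).astype(np.float32) for _ in range(10)]
t = np.linspace(0.1, 1.0, 10)  # frame timestamps in seconds
events = {'x': np.random.randint(0, 240, 5000),
          'y': np.random.randint(0, 180, 5000),
          't': np.sort(np.random.uniform(0.1, 1.0, 5000)),
          'p': np.random.randint(0, 2, 5000)}

def make_blurry(sharp_frames, timestamps):
    # Average sharp frames to synthesize motion blur; the exposure
    # interval is the span of the contributing frame timestamps.
    blurry = np.mean(np.stack(sharp_frames, axis=0), axis=0)
    return blurry, timestamps[0], timestamps[-1]

# Two consecutive blurry frames, synthesized from 5 sharp frames each:
b1, exp_start1, exp_end1 = make_blurry(frames[0:5], t[0:5])    # 0.1 .. 0.5
b2, exp_start2, exp_end2 = make_blurry(frames[5:10], t[5:10])  # 0.6 .. 1.0

# Keep only the events spanning exp_start1 .. exp_end2.
mask = (events['t'] >= exp_start1) & (events['t'] <= exp_end2)
sample = {'blur1': b1, 'exp_start1': exp_start1, 'exp_end1': exp_end1,
          'blur2': b2, 'exp_start2': exp_start2, 'exp_end2': exp_end2,
          'events': {k: v[mask] for k, v in events.items()}}
np.save('sample_000.npy', sample, allow_pickle=True)  # load with allow_pickle=True
```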