tub-rip / event_based_image_rec_inverse_problem

Official implementation of IEEE TPAMI 2022 paper "Formulating Event-based Image Reconstruction as a Linear Inverse Problem with Deep Regularization using Optical Flow"
MIT License

Code for generating optical flows #1

Open wzpscott opened 1 year ago

wzpscott commented 1 year ago

Hi, thanks for your outstanding work. I'd like to try your method on my own event dataset, which has no ground-truth optical flow. Could you release the code for generating optical flow? I also notice that this work is evaluated on the '_rotation' sequences of the IJRR dataset, while other works (E2VID, ECNN, ...) are evaluated on the '_6dof' sequences. I wonder how well this work can adapt to complex camera motions.

EvilPerfectionist commented 1 year ago

Hello, thank you for your kind words and your interest in our work.

I didn't include a specific optical flow estimation method in the code because I want to give users the flexibility to run the algorithm with optical flow from any available source. There are a couple of ways to generate optical flow:

1. https://github.com/TimoStoff/event_cnn_minimal
2. https://github.com/tub-rip/event_based_optical_flow
3. https://github.com/uzh-rpg/E-RAFT

But of course you are free to choose other methods.

This work was only evaluated on the rotation sequences because in these scenarios I could generate high-quality optical flow that is close to ground truth. In rotation sequences, one can estimate a highly accurate angular velocity with CMax, or obtain it from IMU data (the IJRR dataset provides it). The optical flow can then be computed from the angular velocity (see Equation 7 of this paper). Note that the equation simplifies in the rotational case: the linear velocity is 0, so the depth is not needed.
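In case it helps, here is a minimal NumPy sketch of that computation, assuming calibrated intrinsics and a known angular velocity. The function name, the intrinsics handling, and the sign convention are illustrative only and may differ from the code in this repo and from the exact form of Equation 7.

```python
import numpy as np

def flow_from_angular_velocity(omega, K, height, width):
    """Rotational image-motion field (linear velocity = 0, so no depth needed).

    omega: (3,) angular velocity [rad/s] in the camera frame.
    K:     (3, 3) camera intrinsic matrix.
    Returns a flow field of shape (height, width, 2) in pixels/s.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # Pixel grid -> normalized (calibrated) coordinates.
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    x = (u - cx) / fx
    y = (v - cy) / fy

    wx, wy, wz = omega
    # Rotational part of the standard motion-field equations
    # (the translational/depth-dependent terms vanish for pure rotation).
    x_dot = x * y * wx - (1.0 + x**2) * wy + y * wz
    y_dot = (1.0 + y**2) * wx - x * y * wy - x * wz

    # Back to pixel units.
    return np.stack([fx * x_dot, fy * y_dot], axis=-1)
```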

For your last question: the algorithm can adapt to complex camera motions; one just needs a way to compute the optical flow. If the camera motion is complex, for example 6-DOF, one can still estimate the flow with deep learning methods, but the performance of the algorithm will then depend on the quality of the optical flow those methods provide. You can find an example in the paper.

Please don't hesitate to ask if you have further questions.

wzpscott commented 1 year ago

Thanks for your detailed reply. May I ask which specific method you used to generate the optical flow behind the results in the paper (Tab. 1 and Tab. 3)? Did you use CMax-based methods or obtain it from IMU data? I understand that generating optical flow falls outside the scope of this paper; however, I think it would be very helpful to include the code or instructions for it, since different methods do affect the quality of the generated optical flow, and 'the performance of the algorithm will depend on the quality of the optical flow', as you stated.

EvilPerfectionist commented 1 year ago

Hello, the results in Table 1 and Table 3 are obtained using the optical flow estimated by CMax, and they are close to the results obtained using IMU data, as supported by this paper.

To address your request, I will create another branch and show how to compute optical flow from the angular velocity. I would stick to getting the angular velocity from the IMU, since the IJRR dataset provides it. If one wants to try CMax instead (for example this one), they can integrate that code on their own. The instructions for estimating optical flow with deep learning methods are included in their own repositories, so there is nothing additional for me to write; one just needs to save the events and the optical flow when the deep learning model runs inference.

Integrating the optical flow estimation code may take up to two weeks.

wzpscott commented 1 year ago

Thanks a lot for your reply.

wzpscott commented 1 year ago

Update: Hi, I have tried to estimate optical flow using the pretrained EvFlowNet as in ssl-e2vid. However, the results are not very satisfactory. Here are some results I reproduced on the Dynamic Rotation sequence, using the default regularizer configs in main.py:

Denoiser: 100_denoiser
L1: 100_l1
L2: 100_l2

And here are my questions:

  1. Can you give any insights about the above results?
  2. Although you state that [...are obtained using the optical flow estimated by CMax, and they are close to the results obtained using IMU...], some recent papers (e.g. this and this one) seem to suggest that the choice of reward functions and other parameters has a big impact on the performance of CMax, so I still think a detailed implementation of the optical flow estimation is necessary for the reproducibility of this paper.
  3. In your paper you mention that the proposed method naturally adapts to super-resolution. Can you give some instructions about that?

Thanks in advance.

EvilPerfectionist commented 1 year ago

Hello, I uploaded the code to compute optical flow from the angular velocity and wrote some instructions in the README. For your questions:

  1. The results are bad because the estimated flow is not good enough. Do all the reconstructed images from the Dynamic Rotation sequence and other sequences look like the examples you provided, or do some of them look OK and some bad?
  2. I uploaded the code that computes optical flow from the angular velocity (for the rotational case). The angular velocity can be obtained from the IMU, CMax, or ST-PPP. This repo contains the code to estimate the angular velocity with CMax or ST-PPP; I used it when I did the experiments. I would still stick to obtaining the angular velocity from the IMU, because the angular velocities estimated by CMax are very close to the ones from the IMU (see Figure 9 in the paper).
  3. When we warp the events with the flow to create the image of warped events (IWE), the coordinates of the events are no longer integers, so we use bilinear interpolation to spread their weights onto the pixel grid. To do super-resolution, one simply makes the pixel grid denser and votes the events onto that denser grid. For example, if the coordinate of a warped event is (0.5, 0.5), pixels (0, 0), (0, 1), (1, 0) and (1, 1) each gain 0.25. After 2x super-resolution we have more pixels, each half the size of the original ones, and this event votes for pixel (1, 1) only (see the sketch below). Hope it is clear.
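To make the voting concrete, here is a minimal NumPy sketch of bilinear voting of warped events onto a (possibly upscaled) pixel grid. The function name and signature are illustrative, not the repo's actual API.

```python
import numpy as np

def accumulate_iwe(x_warped, y_warped, height, width, scale=1):
    """Bilinearly vote warped event coordinates onto a (possibly upscaled) grid.

    x_warped, y_warped: float arrays of warped event coordinates (pixels).
    scale: 1 for the original resolution, 2 for 2x super-resolution, etc.
    Returns the image of warped events (IWE) of shape (scale*height, scale*width).
    """
    H, W = scale * height, scale * width
    x = x_warped * scale
    y = y_warped * scale

    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    dx = x - x0
    dy = y - y0

    iwe = np.zeros((H, W))
    # Each event spreads its weight over its 4 neighbouring pixels.
    for ix, iy, w in ((x0,     y0,     (1 - dx) * (1 - dy)),
                      (x0 + 1, y0,     dx * (1 - dy)),
                      (x0,     y0 + 1, (1 - dx) * dy),
                      (x0 + 1, y0 + 1, dx * dy)):
        valid = (ix >= 0) & (ix < W) & (iy >= 0) & (iy < H)
        np.add.at(iwe, (iy[valid], ix[valid]), w[valid])
    return iwe
```

With scale=1, an event warped to (0.5, 0.5) contributes 0.25 to each of its four neighbours, as described above; with scale=2 it lands exactly on pixel (1, 1) of the denser grid and votes for that pixel only.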