tub-rip / event_based_optical_flow

The official implementation of "Secrets of Event-based Optical Flow" (ECCV2022 Oral and IEEE T-PAMI 2024)
GNU General Public License v3.0

Why use a fixed number of events for training? #15

Closed. AHupuJR closed this issue 1 year ago.

AHupuJR commented 1 year ago

Hi Shiba, why do you use a fixed number of events in the optimization? As shown in the config files (the .yml files), 'n_events_per_batch' is fixed, and according to the code, events closer to the start time are discarded whenever there are more than 'n_events_per_batch' events. In practice we want to predict the flow over a fixed time interval rather than over a fixed number of events. According to the loss function below, the loss is computed from two IWEs of shape (h, w), which is fixed regardless of 'n_events_per_batch'.
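For context, here is a minimal sketch of the slicing behavior described above, i.e. keeping only the most recent n_events_per_batch events; the function name and the (t, x, y, polarity) event layout are assumptions for illustration, not the repository's actual code.

```python
import numpy as np

def slice_fixed_count(events: np.ndarray, n_events_per_batch: int) -> np.ndarray:
    """events: (N, 4) array of (t, x, y, polarity), sorted by timestamp."""
    if len(events) <= n_events_per_batch:
        return events
    # Drop the events closest to the start time so the batch size stays fixed.
    return events[-n_events_per_batch:]
```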

https://github.com/tub-rip/event_based_optical_flow/blob/1f40c39e6012d86507d313eff26cdc2c7b1503e7/src/costs/normalized_gradient_magnitude.py#L77

I just want to discuss the reason behind this. Thanks!

shiba24 commented 1 year ago

Hi @AHupuJR, the cost function is not related to the choice of the event batch. But imagine a batch that contains only one event: the event image would be almost all zeros, so computing its sharpness makes no sense. The number of events is therefore fixed to ensure there is enough motion in each batch. Indeed, for MVSEC and DSEC benchmarking, one needs to get the timestamps of the evaluation interval and estimate the displacement between start_time and end_time. You can check how I handle this in main.py. Does this answer your question?
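To illustrate the point about sparse batches, here is a simplified sketch of accumulating an image of warped events (IWE). It assumes a single global flow vector and nearest-pixel accumulation, which is a simplification of the actual method, but it shows that with only a handful of events the image stays nearly all zeros, so a sharpness-based cost cannot say anything useful.

```python
import numpy as np

def iwe_from_batch(events: np.ndarray, flow: np.ndarray, t_ref: float,
                   height: int, width: int) -> np.ndarray:
    """events: (N, 4) of (t, x, y, polarity); flow: (2,) in px/s; returns an (H, W) IWE."""
    iwe = np.zeros((height, width), dtype=np.float32)
    t, x, y = events[:, 0], events[:, 1], events[:, 2]
    # Warp each event to the reference time along the candidate flow.
    xw = np.round(x + (t_ref - t) * flow[0]).astype(int)
    yw = np.round(y + (t_ref - t) * flow[1]).astype(int)
    valid = (xw >= 0) & (xw < width) & (yw >= 0) & (yw < height)
    np.add.at(iwe, (yw[valid], xw[valid]), 1.0)
    return iwe  # with a near-empty batch this image is almost all zeros
```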

AHupuJR commented 1 year ago

Hi, Shiba. Thank you for your reply.

I see why using a fixed number of events is more reasonable than a fixed time interval: the events within a fixed time interval may be too few to estimate the flow. But with the scripts in main.py, if there are too many events, far more than fall within a given time interval, the flow predicted for that period is closer to the flow near the end time. Is the number of events per batch chosen specially, e.g. as the average number of events in a time period?

Hope you have a nice weekend! Thanks a lot.

shiba24 commented 1 year ago

I'm not sure I understood your question correctly, but what is called an "optical flow benchmark", i.e. MVSEC or DSEC, is in a sense a "pixel displacement benchmark", if that makes more sense to you. What is evaluated is in units of [pix], not [pix/s] (velocity). That is why I have both batch_for_gt_slice and batch_for_optimization in main.py.
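In other words, the benchmark ground truth is a displacement over a known interval, so an estimated velocity field has to be multiplied by the duration of the ground-truth slice. A small sketch of that conversion; the variable names are illustrative, not the ones used in main.py:

```python
import numpy as np

def velocity_to_displacement(flow_velocity: np.ndarray,
                             gt_t_start: float, gt_t_end: float) -> np.ndarray:
    """flow_velocity: (2, H, W) in [pix/s]; returns displacement in [pix]."""
    dt_gt = gt_t_end - gt_t_start   # duration of the ground-truth slice [s]
    return flow_velocity * dt_gt    # the unit the benchmarks actually evaluate
```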

AHupuJR commented 1 year ago

I think I understand how the optical flow using batch_for_gt_slice is predicted now. You use batch_for_optimization to optimize the flow, and simply multiply by the timescale, which equals the flow_time, i.e. the dt of the GT flow.

If I want to predict the flow between two frames using my own events, without GT flow, I only need to multiply best_motion by my dt. Then how should I determine n_events_per_batch for my data? Use the number in the yml (30000)? Or adjust it according to my frame size? With frames of size 262x320 I think it would be about the same number. What if I use a resolution of 720x1280, should I also increase n_events_per_batch?

If my dt is quite small, like 0.1 (because I want to predict the flow within the exposure time), can I set n_events_per_batch much smaller, like 3000?

Thank you very much!

shiba24 commented 1 year ago

Then how should I determine n_events_per_batch for my data?

This is not an easy question to answer. There is a body of work on how many events one should choose for better estimation. For example, the exact same motion can cause very different numbers of events depending on the texture of the scene. Imagine a scene with very fine textures: you will suddenly get a large number of events, even within 0.1 seconds.

You would need to trial-and-error n_events_per_batch for your use case. I assume you want to try this method for deblurring; as a user of the method, two perspectives I can give you are:

So, try to maximize n_events_per_batch as long as the first assumption holds true. Also, I would suggest still decoupling batch_for_optimization and batch_during_expose (so to speak): estimate on batch_for_optimization, and scale the result to batch_during_expose if batch_during_expose gives you too few events for a stable estimation.
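A minimal sketch of that decoupling, under some assumptions: events are an (N, 4) array of (t, x, y, polarity) sorted by timestamp, estimate_flow is whatever optimizer you run on one batch (passed in as a callable, returning the displacement over that batch's time span), and the displacement scales linearly with time. None of these names come from the repository.

```python
from typing import Callable
import numpy as np

def flow_during_exposure(events: np.ndarray,
                         exposure_t0: float, exposure_t1: float,
                         n_events_per_batch: int,
                         estimate_flow: Callable[[np.ndarray], np.ndarray]) -> np.ndarray:
    """Estimate the displacement over [exposure_t0, exposure_t1]."""
    t = events[:, 0]
    in_exposure = events[(t >= exposure_t0) & (t <= exposure_t1)]  # batch_during_expose
    if len(in_exposure) >= n_events_per_batch:
        return estimate_flow(in_exposure)  # enough events: optimize directly
    # Too few events: build a larger batch_for_optimization around the exposure window,
    # optimize on it, then rescale its displacement to the exposure duration.
    center = int(np.searchsorted(t, 0.5 * (exposure_t0 + exposure_t1)))
    lo = max(0, center - n_events_per_batch // 2)
    batch_for_optimization = events[lo:lo + n_events_per_batch]
    dt_opt = batch_for_optimization[-1, 0] - batch_for_optimization[0, 0]
    displacement = estimate_flow(batch_for_optimization)
    return displacement * (exposure_t1 - exposure_t0) / dt_opt
```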

And a more practical note: I actually tried this method on a 1-megapixel event camera (it is currently under review, so I can't share the details). It works quite well. For typical scenes with moving cameras, I would use somewhere between 1M and 2M events for the optimization. Even more could work, but I'm not sure.

If you want to know further details, you can write me an email: sshiba[at]keio.jp or shiba.shintaro[at]gmail.com. Hope this helps your research!

AHupuJR commented 1 year ago

Thank you, shiba! Thanks for the answer! Hope everything goes well with your research! Best, Lei