SensorsINI / v2e

V2E: From video frames to DVS events
https://sites.google.com/view/video2events/home
MIT License
Event timestamps do not match the original video's fps #35

Closed oshima-yoppi closed 2 years ago

oshima-yoppi commented 2 years ago

Thank you for developing such a brilliant tool!!

The event timestamps do not match the frame times of the original video, even though slow motion is disabled. The video is at 100 fps, but the event timestamps are not spaced every 0.01 s. I would like to know why this is so.

The command I used is as follows.

python v2e.py -i input\0.avi --output_folder=output/tennis --dvs_text DVS_TEXT --dvs_exposure duration 0.01 --overwrite --timestamp_resolution=0.01 --auto_timestamp_resolution=False --output_folder=output/tennis --overwrite --pos_thres=.15 --neg_thres=.15 --sigma_thres=0.03 --dvs_aedat2 tennis.aedat --output_width=240 --output_height=180 --cutoff_hz=15 --disable_slomo --timestamp_resolution=0.01 --input_slowmotion_factor 1

The data output to DVS_TEXT.txt is as follows.

...
0.044999998062849045 149 170 1
0.044999998062849045 37 82 1
0.044999998062849045 85 83 0
0.044999998062849045 58 67 1
0.05000000074505806 61 64 1
0.06000000238418579 112 120 0
0.06000000238418579 72 67 0
0.06000000238418579 156 179 1
...

tobidelbruck commented 2 years ago

With --disable_slomo, v2e should just use the frame times for the DVS event timestamps, unless a pixel produces more than 1 event per input frame. We had to choose some scheme for spreading these additional events over the time between frames. The unfortunate side effect is a regular quantization of the timestamps, with clumps of more events at the frame times.

I.e., pixels that make only 1 event will have a 10 ms timestamp. For each frame difference, v2e finds the pixel(s) with the largest number of events, e.g. 9 events. Those pixels will have timestamps spaced about 1 ms apart, and other pixels with smaller change will make fewer events, down to pixels with only a single event. You will notice that more pixels make timestamps with 10 ms quantization, since they make only 1 event.

This explanation is probably not so clear, but try to visualize the output events in a 3D spacetime view and you will notice pyramids of events. Does this explanation help clarify what is going on?
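To make the quantization concrete, here is a minimal Python sketch. It is not v2e's actual code. The function name `substep_timestamps` is hypothetical, and it assumes that each frame interval is divided into as many equal sub-steps as the largest per-pixel event count, with the last sub-step landing on the next frame time.

```python
def substep_timestamps(t_start, frame_interval, max_events):
    """Candidate event timestamps inside one frame interval.

    Assumption (not taken from the v2e source): the interval is split into
    max_events equal sub-steps, the last one ending at the next frame time.
    """
    if max_events < 1:
        return []
    step = frame_interval / max_events
    return [t_start + (i + 1) * step for i in range(max_events)]


# 100 fps input, so the frame interval is 0.01 s.
# If the busiest pixel makes 2 events between 0.04 s and 0.05 s,
# timestamps fall roughly at 0.045 s and 0.05 s.
print(substep_timestamps(0.04, 0.01, 2))

# If no pixel makes more than 1 event, the only timestamp is the frame time.
print(substep_timestamps(0.05, 0.01, 1))

# With 9 events at the busiest pixel, the spacing is about 1.1 ms.
print(substep_timestamps(0.0, 0.01, 9))
```

Under this assumption, a timestamp such as 0.045 s in the text output would simply mean that some pixel produced 2 events in that frame interval, while intervals where every active pixel produced a single event only show the 10 ms frame times.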

tobidelbruck commented 2 years ago

I have added a clarifying video to README.md.