Closed anth2o closed 4 years ago
Actually, the GPU is barely used when doing inference on a video. The main bottleneck is opening the images and/or resizing them before sending them to the object_detection model.
Performance with multiprocessing for 343 frames and 1 GPU:
When doing inference on a video, we need to run object detection on every frame we split out. For a 1-minute video at 4 inference frames per second, that means 60 × 4 = 240 pictures to run inference on. The average inference speed of the FPN is 120 ms/img, so inference alone takes 240 × 0.12 ≈ 29 seconds per video. On top of that, there is the time to open the pictures from disk.
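The arithmetic above can be checked quickly (the 120 ms/img figure is the measured FPN speed mentioned above):

```python
fps = 4                        # inference frames per second
duration_s = 60                # 1-minute video
ms_per_img = 120               # average FPN inference time per image

n_frames = fps * duration_s              # 4 * 60 = 240 pictures
inference_s = n_frames * ms_per_img / 1000  # 240 * 0.12 = 28.8 s

print(n_frames, inference_s)   # 240 28.8 -- roughly 30 s, excluding disk I/O
```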
I can think of two solutions to improve inference time:

- do not store pictures on disk when splitting the video, and run inference directly on each frame
- multiprocess the inference
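A minimal sketch of the second idea, overlapping frame loading/preprocessing with inference. Everything here is hypothetical: `load_and_preprocess` stands in for the real `cv2.imread`/`cv2.resize` step, and `run_inference` stands in for the detection model's forward pass; only the overlap pattern is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def load_and_preprocess(frame_id):
    # Stand-in for reading a frame from disk and resizing it
    # (in the real pipeline: cv2.imread + cv2.resize).
    return [frame_id] * 4

def run_inference(batch):
    # Stand-in for the object-detection forward pass (~120 ms/img on GPU).
    return [sum(img) for img in batch]

def infer_video(frame_ids, batch_size=8, workers=4):
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Worker threads load/preprocess upcoming frames while the
        # main thread keeps the GPU busy with inference on ready batches.
        images = pool.map(load_and_preprocess, frame_ids)
        batch = []
        for img in images:
            batch.append(img)
            if len(batch) == batch_size:
                results.extend(run_inference(batch))
                batch = []
        if batch:
            results.extend(run_inference(batch))
    return results

print(infer_video(range(10)))  # [0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
```

Threads (rather than processes) are enough when the bottleneck is disk I/O, since image reads release the GIL; for CPU-heavy resizing, a `multiprocessing.Pool` doing the preprocessing would be the analogous structure.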