I'm anonymising long interview videos, so processing time is the limiting factor.
To improve this, could you use tracking, as is done in other object detection pipelines: run detection periodically and track the faces in the intermediate frames?
This should help in a few ways:
- It will be more robust to occlusion, where detection might fail.
- It will significantly improve performance, since detection only runs every nth frame and the cheaper tracking covers the frames in between.
- It will reduce the frame-to-frame jitter that the bounding boxes/blurred areas currently show.
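Here's a rough sketch of the loop I have in mind. It isn't tied to this project's code: `detect`, `make_tracker`, the `Tracker` interface and `detect_every` are all hypothetical names, and the tracker is assumed to return an `(ok, box)` pair from `update()`.

```python
from typing import Callable, Iterable, Iterator, List, Protocol, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


class Tracker(Protocol):
    """Hypothetical tracker interface: update() returns (ok, new_box)."""

    def update(self, frame) -> Tuple[bool, Box]: ...


def detect_and_track(
    frames: Iterable,
    detect: Callable[[object], List[Box]],
    make_tracker: Callable[[object, Box], Tracker],
    detect_every: int = 10,
) -> Iterator[List[Box]]:
    """Yield the face boxes for each frame, detecting only periodically."""
    trackers: List[Tracker] = []
    for index, frame in enumerate(frames):
        if index % detect_every == 0 or not trackers:
            # Expensive path: run the detector and start one tracker per face.
            # This also runs whenever nothing is being tracked.
            boxes = detect(frame)
            trackers = [make_tracker(frame, box) for box in boxes]
        else:
            # Cheap path: advance each tracker, dropping any that lost its face.
            boxes, alive = [], []
            for tracker in trackers:
                ok, box = tracker.update(frame)
                if ok:
                    boxes.append(box)
                    alive.append(tracker)
            trackers = alive
        yield boxes
```

The boxes yielded for each frame would then go to the existing blurring step. Falling back to detection whenever no tracker is alive is one possible choice; it means a video with no faces gets no speed-up.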
Other optimisations
Number of expected faces:
You could include a parameter for the number of faces expected in the image. If you're just expecting 2-3, then as long as the tracks on those are good you can run detection less frequently, again speeding things up.
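As a minimal sketch of how that parameter might be used, assuming the tool knows how many faces it is currently tracking (`next_detection_gap`, `base_gap` and `max_gap` are made-up names and the default values are arbitrary):

```python
def next_detection_gap(
    tracked_faces: int,
    expected_faces: int,
    base_gap: int = 5,
    max_gap: int = 30,
) -> int:
    """Return how many frames to wait before the next detection pass."""
    if tracked_faces >= expected_faces:
        # Every expected face has a live track, so detection can run rarely.
        return max_gap
    # At least one expected face is missing, so look again soon.
    return base_gap
```

So for an interview with two people, `next_detection_gap(2, 2)` gives the long gap, and it drops back to the short gap as soon as one track is lost.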
Minimum face size:
A minimum face size parameter could help if you're mainly looking for people in the foreground, and would also reduce false positives.
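Something like the filter below is what I mean, applied to the detector's output. The function name, the `min_size` parameter and the `(x, y, width, height)` box layout are assumptions for illustration only.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def filter_small_faces(boxes: List[Box], min_size: int) -> List[Box]:
    """Keep only boxes at least `min_size` pixels wide and tall."""
    return [box for box in boxes if box[2] >= min_size and box[3] >= min_size]


# A large foreground face is kept, a small background one is dropped.
foreground_only = filter_small_faces([(0, 0, 120, 130), (5, 5, 20, 25)], 40)
```

If the underlying detector accepts a minimum size itself, passing it there would be better than filtering afterwards, since the detector can then skip the small scales entirely.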