AIRLegend / aitrack

6DoF Head tracking software
MIT License

Add center weighted face detection to version 0.6.6 using cv::FaceDet… #164

Closed searching46dof closed 1 year ago

searching46dof commented 1 year ago

…ectorYN

cv::FaceDetectorYN::detect returns a matrix of faces, one face per row. get_distance_squared is used to pick the row closest to the center of the preview window. center_weighted should be made configurable, since center-weighted detection may not be desirable in all circumstances.
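For illustration, here is a minimal sketch of the center-weighted selection described above, written against plain arrays so it compiles without OpenCV. It assumes the 15-float row layout that cv::FaceDetectorYN::detect produces (x, y, width, height, ten landmark coordinates, score). `FaceRow`, `distance_squared_to_center` and `pick_face` are hypothetical names, and the PR's actual get_distance_squared may be implemented differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// One detection row: [x, y, w, h, 10 landmark coordinates, score].
using FaceRow = std::array<float, 15>;

// Squared distance from the bounding-box center to the frame center.
// Hypothetical helper, not the PR's get_distance_squared.
inline float distance_squared_to_center(const FaceRow& f, float frame_w, float frame_h) {
    const float cx = f[0] + f[2] * 0.5f;  // box center x
    const float cy = f[1] + f[3] * 0.5f;  // box center y
    const float dx = cx - frame_w * 0.5f;
    const float dy = cy - frame_h * 0.5f;
    return dx * dx + dy * dy;
}

// Returns the index of the face to track, or -1 if none was detected.
// When center_weighted is false, the first row is used unchanged.
inline int pick_face(const std::vector<FaceRow>& faces, float frame_w, float frame_h,
                     bool center_weighted) {
    assert(frame_w > 0.0f && frame_h > 0.0f);
    if (faces.empty()) return -1;
    if (!center_weighted) return 0;

    std::size_t best = 0;
    float best_d = distance_squared_to_center(faces[0], frame_w, frame_h);
    for (std::size_t i = 1; i < faces.size(); ++i) {
        const float d = distance_squared_to_center(faces[i], frame_w, frame_h);
        if (d < best_d) {
            best_d = d;
            best = i;
        }
    }
    return static_cast<int>(best);
}
```

Comparing squared distances avoids a square root per face and gives the same ordering. The `center_weighted` flag is the switch a config setting would drive.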

searching46dof commented 1 year ago

Re-adding center weighted face detection intended only for cv::FaceDetectorYN

searching46dof commented 1 year ago

score_threshold and nms_threshold should be made configurable as well. cv::FaceDetectorYN appears to produce more false positives, detecting faces where there are none. These parameters, together with center weighting, can increase the accuracy or range of the face detection.

Relaxing score_threshold may give a wider usable range of values and push out the point at which face detection is lost. This would significantly help with the larger pitch range and finer pitch granularity needed for flight ACM (air combat maneuvering) when dogfighting.

Center weighting alone does not solve the issue. At extreme angles of yaw, pitch, and roll, AITrack will lose tracking of a center-weighted face. It can then pick up secondary faces in the background, which may have a completely different orientation. Partially overlapping faces (e.g. over the shoulder) can be masked by nms_threshold. False detections of non-faces (which seem to depend on the lighting at that time of day) may also be masked by raising score_threshold.

Access to the full range of these values is not necessary. score_threshold currently has a value of 0.8f, but the allowed range could be 0.6 to 0.9 (values above 0.5). nms_threshold currently has a value of 0.3f, but the allowed range could be 0.1 to 0.4 (values below 0.5).
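As a rough sketch of how the restricted ranges above could be enforced when the values come from a config file: `DetectorThresholds`, `clamp_to` and `clamp_thresholds` are hypothetical names, and the bounds are just the ones suggested in this comment. The clamped values could then be handed to the detector, for example through cv::FaceDetectorYN's setScoreThreshold and setNMSThreshold.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical holder for the two user-configurable thresholds.
struct DetectorThresholds {
    float score_threshold;
    float nms_threshold;
};

// Clamp v into [lo, hi].
inline float clamp_to(float v, float lo, float hi) {
    assert(lo <= hi);
    return std::max(lo, std::min(v, hi));
}

// Restrict user-supplied values to the suggested ranges:
// score_threshold in [0.6, 0.9], nms_threshold in [0.1, 0.4].
inline DetectorThresholds clamp_thresholds(float score, float nms) {
    return DetectorThresholds{clamp_to(score, 0.6f, 0.9f), clamp_to(nms, 0.1f, 0.4f)};
}
```

Clamping rather than rejecting out-of-range input keeps the tracker running with a sane value even if the config file is edited by hand.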

AIRLegend commented 1 year ago

Nice, thank you very much!! 🎉

Yeah, we could lower score_threshold (and add it to the config settings) as we now pick the centermost face. I take note :)

However, regarding the nms_threshold parameter, I've been testing over the last few days and found it better to keep the bounding box as stable as possible (I mean, maintaining the same aspect ratio). That's because when the bounding box crop is passed to the landmark model it's downscaled to 224x224 (or even lower with the new model I'm testing), so changes in the aspect ratio "deform" the face, which in turn slightly deforms the predicted landmarks and can introduce some "jumps" in the PositionSolver output.
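To make the deformation concrete, here is a small sketch of the arithmetic. It assumes the crop is stretched directly to a square input with no letterboxing, which is an assumption about the preprocessing rather than something stated here. `Scale`, `resize_scale` and `stretch_ratio` are hypothetical names.

```cpp
#include <cassert>

// Per-axis scale factors applied when a w x h crop is resized to target x target.
struct Scale {
    float sx;
    float sy;
};

inline Scale resize_scale(float w, float h, float target = 224.0f) {
    assert(w > 0.0f && h > 0.0f);
    return Scale{target / w, target / h};
}

// Horizontal stretch relative to vertical stretch.
// 1.0 means the face keeps its proportions; anything else means it is deformed.
inline float stretch_ratio(float w, float h) {
    const Scale s = resize_scale(w, h);
    return s.sx / s.sy;  // algebraically equal to h / w
}
```

Under that assumption, a 200x200 box gives a ratio of 1.0, while a 200x250 box on the next frame gives about 1.25, so the same face reaches the landmark model roughly 25% wider relative to its height.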

In order to keep the BB stable, the best thing to do is to be aggressive with the nms_threshold (i.e. low values) and also to use few bounding box candidates (the top-k param), for which ~15 feels like a sweet spot.
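To show how these two knobs interact, here is a generic greedy non-maximum suppression sketch. It is not OpenCV's internal implementation, and `Box`, `iou` and `nms` are hypothetical names. Candidates are sorted by score, cut to the best `top_k`, and any box whose IoU with an already kept box exceeds `nms_threshold` is dropped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Box {
    float x, y, w, h, score;
};

// Intersection over union of two axis-aligned boxes.
inline float iou(const Box& a, const Box& b) {
    const float x1 = std::max(a.x, b.x);
    const float y1 = std::max(a.y, b.y);
    const float x2 = std::min(a.x + a.w, b.x + b.w);
    const float y2 = std::min(a.y + a.h, b.y + b.h);
    const float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: keep the best-scoring boxes, dropping any candidate that
// overlaps an already kept box by more than nms_threshold.
inline std::vector<Box> nms(std::vector<Box> boxes, float nms_threshold, std::size_t top_k) {
    assert(nms_threshold >= 0.0f && nms_threshold <= 1.0f);
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.score > b.score; });
    if (boxes.size() > top_k) boxes.resize(top_k);  // keep only the top_k candidates

    std::vector<Box> kept;
    for (const Box& c : boxes) {
        bool suppressed = false;
        for (const Box& k : kept) {
            if (iou(c, k) > nms_threshold) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) kept.push_back(c);
    }
    return kept;
}
```

A lower `nms_threshold` suppresses more of the overlapping candidates. In this sketch, two boxes with an IoU of about 0.33 collapse into one at a threshold of 0.3 but both survive at 0.4.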