cosanlab / py-feat

Facial Expression Analysis Toolbox
https://py-feat.org/
Other
260 stars 71 forks

Image size degrades emotion classification accuracy when holding face pixel size constant #135

Open markallenthornton opened 2 years ago

markallenthornton commented 2 years ago

Using v0.4.0 or the current m1_testing branch, with the default detectors specified in the documentation, I'm seeing emotion classification accuracy degrade on large images even when the pixel size of the faces themselves is held constant. Simply cropping out the face-free parts of the image improves accuracy considerably.

I suspect this happens because the image is downsampled for face detection, and the faces are then extracted with the resulting bounding boxes from the downsampled image rather than the original. In large images that are mostly free of faces, this would downsample the faces unnecessarily and degrade performance. If that is the problem, I would suggest scaling the bounding boxes back up to the original image resolution and extracting the faces from the original image. The crops could still be downsampled afterwards if the emotion model requires it, but the face resolution would no longer depend on something arbitrary like the overall image size.
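A minimal sketch of the suggested fix, in plain Python so that it does not depend on py-feat internals. The helper names `rescale_box` and `crop` are hypothetical and are not part of the py-feat API. It assumes boxes are `(x1, y1, x2, y2)` in pixel coordinates of the downsampled detection image, and that image sizes are given as `(height, width)`.

```python
import math


def rescale_box(box, detect_size, original_size):
    """Map a box from detection-image coordinates to original-image coordinates.

    box: (x1, y1, x2, y2) in the downsampled image used for face detection.
    detect_size, original_size: (height, width) of each image.
    """
    x1, y1, x2, y2 = box
    det_h, det_w = detect_size
    orig_h, orig_w = original_size
    sx, sy = orig_w / det_w, orig_h / det_h  # per-axis upscaling factors

    # Round outwards so no face pixels are lost, then clamp to the image bounds.
    left = max(0, math.floor(x1 * sx))
    top = max(0, math.floor(y1 * sy))
    right = min(orig_w, math.ceil(x2 * sx))
    bottom = min(orig_h, math.ceil(y2 * sy))
    return left, top, right, bottom


def crop(image, box):
    """Crop a row-major image (a sequence of rows) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Toy example: a 400x600 original that was downsampled 4x to 100x150 for detection.
original = [[0] * 600 for _ in range(400)]
detect_size = (100, 150)
box_small = (20, 10, 30, 22)  # hypothetical detection in the small image

box_full = rescale_box(box_small, detect_size, (len(original), len(original[0])))
face = crop(original, box_full)  # face pixels taken from the original image
```

With this approach the crop keeps the face at its native resolution; any resizing needed by the emotion model can then be applied to the crop rather than to the whole image.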

ejolly commented 1 year ago

Thanks for the suggestion!

To partially address this issue (among others), since 0.5.0 we support passing kwargs to the underlying pre-trained detectors during initialization and prediction. We will continue pursuing more robust solutions.

ejolly commented 1 year ago

Related: #73