Closed AndreaLanfranchi closed 6 months ago
Good question. This detector was added by a PR. I need to investigate, then I will let you know here.
Possibly I got the answer myself. According to MTCNN documentation:
image_size {int} -- Output image size in pixels. The image will be square. (default: {160})
This value is used only when MTCNN is invoked through the forward() method (which does detection and extraction).
When the detect() method is used instead, the value of image_size is irrelevant.
Besides I would underline that this comment:
select_largest=False, # return result in descending order
is misleading.
In fact, also according to documentation
select_largest {bool} -- If True, if multiple faces are detected, the largest is returned.
If False, the face with the highest detection probability is returned.
(default: {True})
This forces only one face to be returned (the largest, or the most probable one when select_largest is False) unless the argument keep_all is also set to True.
As a result, the enumeration over the results is redundant, since under this configuration we expect only one element or none.
Ref
select_largest {bool} -- If True, if multiple faces are detected, the largest is returned.
If False, the face with the highest detection probability is returned.
(default: {True})
selection_method {string} -- Which heuristic to use for selection. Default None. If
specified, will override select_largest:
"probability": highest probability selected
"largest": largest box selected
"largest_over_threshold": largest box over a certain probability selected
"center_weighted_size": box size minus weighted squared offset from image center
(default: {None})
keep_all {bool} -- If True, all detected faces are returned, in the order dictated by the
select_largest parameter. If a save_path is specified, the first face is saved to that
path and the remaining faces are saved to <save_path>1, <save_path>2 etc.
(default: {False})
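The selection rule described in the quoted docstring can be sketched as follows. This is only a simplified reading of that documentation, not the library's code: select_faces is a hypothetical helper, boxes are assumed to be (x1, y1, x2, y2) tuples, and selection_method is left out. It models the documented extraction behaviour and says nothing about what detect() returns.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def select_faces(
    boxes: List[Box],
    probs: List[float],
    select_largest: bool = True,
    keep_all: bool = False,
) -> List[int]:
    """Return the indices of the selected faces (hypothetical helper)."""

    def area(i: int) -> float:
        x1, y1, x2, y2 = boxes[i]
        return (x2 - x1) * (y2 - y1)

    # select_largest decides the ordering: by box area or by probability.
    key = area if select_largest else (lambda i: probs[i])
    order = sorted(range(len(boxes)), key=key, reverse=True)

    # keep_all decides whether every face or only the first one is kept.
    return order if keep_all else order[:1]


boxes = [(0, 0, 10, 10), (0, 0, 50, 50), (5, 5, 25, 25)]
probs = [0.99, 0.90, 0.95]
print(select_faces(boxes, probs, select_largest=False))
print(select_faces(boxes, probs, select_largest=False, keep_all=True))
```

Under this reading, select_largest only changes which face comes first, while keep_all is what changes the number of faces returned.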
As a result I would remove the explicit assignments in the creation of the MTCNN instance, as they are exactly the default values assumed by the model.
Very helpful, ty.
Will sort source code tomorrow (most probably)
Also consider that, to be consistent with other models (which return all the faces detected in an image), this should also set the argument keep_all to True; otherwise only one face is returned.
As a result the enumeration has to be kept.
Yeah that is a bug, detectors should return all faces not just one.
I do not think the keep_all arg is necessary, because the return type is a list and many faces should be returned.
I just tested this with the following snippet. Seems it is working fine as is.
from deepface import DeepFace
import matplotlib.pyplot as plt
import cv2

img_path = "dataset/couple.jpg"
img = cv2.imread(img_path)

objs = DeepFace.extract_faces(img_path=img_path, detector_backend="fastmtcnn")

# draw a rectangle around every detected face
for obj in objs:
    # plt.imshow(obj["face"])
    x = obj["facial_area"]["x"]
    y = obj["facial_area"]["y"]
    w = obj["facial_area"]["w"]
    h = obj["facial_area"]["h"]
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 255, 255), 1)

fig = plt.figure(figsize=(10, 10))
plt.imshow(img[:, :, ::-1])  # BGR -> RGB for matplotlib
plt.show()
So, setting the arg select_largest to False does not cause just one face to be returned. The source documentation may not be up to date or correct.
I plan to close this since it is not a bug, unless you want to raise anything else?
Ok then. Better this way. Apparently keep_all applies only to extraction, exactly like image_size.
May I close this?
Yes, that solves my doubt. Nevertheless I would change the instantiation line to
self._detector = fast_mtcnn(device="cpu")
where device is the only non-default value set. This removes the ambiguity around image_size, which is actually not relevant in this scope.
Thank you for your contribution again
One more bit ... I would add an extra safety validation here.
Actually, reading the detect implementation, I see that the img argument can be a 4-dimensional array (hence multiple images). If that is the case, the returned tuple becomes an array of arrays, which invalidates this:
for current_detection in zip(*detections):
    x, y, w, h = self._xyxy_to_xywh(current_detection[0])
    confidence = current_detection[1]
    left_eye = current_detection[2][0]
    right_eye = current_detection[2][1]
I believe that each detector is supposed to work on a single discrete image only.
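A guard along these lines could reject batched input before the detection results are unpacked. This is a minimal sketch and not the change made in the project: ensure_single_image and xyxy_to_xywh are hypothetical names, the image is assumed to expose a NumPy-style shape attribute, and the box conversion is only a guess at what _xyxy_to_xywh does.

```python
from typing import Any, Sequence, Tuple


def ensure_single_image(img: Any) -> None:
    """Raise ValueError unless img looks like one (H, W, C) image.

    Assumption: img has a NumPy-style ``shape`` attribute.
    """
    shape = getattr(img, "shape", None)
    if shape is None or len(shape) != 3:
        raise ValueError(
            f"expected a single image with 3 dimensions, got shape {shape}"
        )


def xyxy_to_xywh(box: Sequence[float]) -> Tuple[int, int, int, int]:
    """Convert (x1, y1, x2, y2) to (x, y, w, h); assumed behaviour."""
    x1, y1, x2, y2 = (int(v) for v in box)
    return x1, y1, x2 - x1, y2 - y1
```

Calling ensure_single_image(img) before running detection would make a 4-dimensional batch fail early with a clear message, instead of failing later inside the zip(*detections) loop.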
Handled with PR - https://github.com/serengil/deepface/pull/1115
Consider this code happening in the initialization of the model:
The argument image_size is set to a constant value of 160. However, the method detect_faces apparently does not take this value into account, hence the question: do we have to resize the image to the image_size value before processing, or is it automagically scaled?
Thank you