YuvalNirkin / find_face_landmarks

C++ \ Matlab library for finding face landmarks and bounding boxes in video\image sequences.

Clarification about Bounding Box Creation #23

Closed dtoniolo closed 2 years ago

dtoniolo commented 2 years ago

First of all, thanks for the great project. I was trying to replicate some results, but I have a couple of questions about how bounding boxes are obtained from dlib landmarks in bbox_from_landmarks() from find_face_landmarks/interfaces/matlab/bbox_from_landmarks.m:

function bbox = bbox_from_landmarks(landmarks, frameWidth, frameHeight, square)
%BBOX_FROM_LANDMARKS(landmarks, frameWidth, frameHeight, square) Compute
%   bounding box from landmarks.
%   Input:
%   Output:
%       bbox - Output bounding box [minx miny width height].

%% Parse input arguments
if(~exist('square','var'))
    square = 1;
end

%% Calculate bounding box
minp = min(landmarks);
maxp = max(landmarks);
size = double(maxp - minp + 1);
center = double((maxp + minp)/2);
avg = round(mean(landmarks));
dev = center - avg;
dev_lt = round([0.1*size(1) size(2)*(max(size(1)/size(2),1)*2-1)]) +...
    abs(min(dev,0));
dev_rb = round(0.1*size) + max(dev,0);

%% Limit to frame boundaries
minp = max(double(minp) - dev_lt, 1);
maxp = min(double(maxp) + dev_rb, [frameWidth frameHeight]);

%% Make square
if(square)
    size = maxp - minp + 1;
    sq_size = max(size);
    half_sq_size = round((sq_size - 1)/2);
    center = round((maxp + minp)/2);
    minp = center - half_sq_size;
    maxp = center + half_sq_size;

    % Limit to frame boundaries
    minp = max(minp, 1);
    maxp = min(maxp, [frameWidth frameHeight]);
end

%% Output bounding box
bbox = [minp (maxp - minp + 1)];

end

First, the bounding box is initialised from the minimum and maximum landmark coordinates along each dimension. Immediately afterwards, however, its margins are adjusted using the offset between its center and the mean location of the keypoints (dev). Could you please explain why this correction is made? Does it make the bounding box more robust in some way?
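To make the question concrete, here is a minimal Python sketch of the margin computation, transcribed from the "Calculate bounding box" step of the MATLAB listing above. It is an illustrative port, not the project's code: the names `matlab_round` and `margins` and the sample points are hypothetical, the landmarks are assumed to be floating-point (x, y) pairs, and MATLAB's round-half-away-from-zero behaviour is emulated by hand.

```python
import math


def matlab_round(v):
    """Round half away from zero, like MATLAB's round().

    Python's built-in round() rounds halves to the nearest even integer.
    """
    return math.floor(abs(v) + 0.5) * (1 if v >= 0 else -1)


def margins(landmarks):
    """Return (dev, dev_lt, dev_rb) for a list of (x, y) landmark points."""
    xs = [p[0] for p in landmarks]
    ys = [p[1] for p in landmarks]
    minp = (min(xs), min(ys))
    maxp = (max(xs), max(ys))

    # Extent and center of the tight box around the landmarks.
    size = [maxp[i] - minp[i] + 1 for i in range(2)]
    center = [(maxp[i] + minp[i]) / 2 for i in range(2)]

    # Offset between the box center and the (rounded) landmark mean.
    avg = [matlab_round(sum(xs) / len(xs)), matlab_round(sum(ys) / len(ys))]
    dev = [center[i] - avg[i] for i in range(2)]

    # Left/top margins: a base term plus |dev| where dev is negative.
    base_lt = [
        matlab_round(0.1 * size[0]),
        matlab_round(size[1] * (max(size[0] / size[1], 1) * 2 - 1)),
    ]
    dev_lt = [base_lt[i] + abs(min(dev[i], 0)) for i in range(2)]

    # Right/bottom margins: 10% of the extent plus dev where dev is positive.
    dev_rb = [matlab_round(0.1 * size[i]) + max(dev[i], 0) for i in range(2)]
    return dev, dev_lt, dev_rb


# Hypothetical landmarks clustered towards the left of their bounding box.
dev, dev_lt, dev_rb = margins([(0, 0), (0, 100), (20, 50), (100, 50)])
```

For these sample points the landmark mean lies 20 px to the left of the box center, so dev is (20, 0), the left/top margins are (10, 101) and the right/bottom margins are (30, 10). In other words, the extra 20 px are added on the side opposite to where the landmarks cluster.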

Ignoring the aforementioned correction for a moment, it is common to expand the size of the bounding box around its center by some fraction x (0.2 in this case), e.g.

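x = 0.2;                                   % total fractional expansion (0.1 per side)
size = (1 + x) * double(maxp - minp + 1);  % expanded size; without this factor the box is unchanged
half_size = (size - 1)/2;
center = double(maxp + minp)/2;            % the center stays where it was
minp = center - half_size;
maxp = center + half_size;
bbox = [minp (maxp - minp + 1)];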

This operation is quite simple and preserves the aspect ratio. However, in bbox_from_landmarks() the top margin is computed differently (as size(2)*(max(size(1)/size(2),1)*2-1) instead of 0.1 * size(2)). Could you explain the rationale for this alteration?
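For reference, the expression size(2)*(max(size(1)/size(2),1)*2-1) can be simplified algebraically. Writing w = size(1) and h = size(2), it equals h when w <= h and 2w - h otherwise, that is, 2*max(w, h) - h. The Python sketch below only checks this identity numerically and compares it with the 0.1*h base term used for the bottom margin; the function names are hypothetical, and it ignores the rounding and the dev term applied in the original.

```python
def top_margin_original(w, h):
    """Top margin as written in the listing, before rounding and the dev term."""
    return h * (max(w / h, 1) * 2 - 1)


def top_margin_simplified(w, h):
    """Equivalent closed form: 2 * max(w, h) - h."""
    return 2 * max(w, h) - h


# Compare both forms with the 10% base margin for a few hypothetical extents.
for w, h in [(100, 120), (120, 120), (150, 100)]:
    original = top_margin_original(w, h)
    simplified = top_margin_simplified(w, h)
    print(w, h, original, simplified, 0.1 * h)
```

For a 100x120 landmark extent both forms give a top margin of 120 px, against a 12 px base margin at the bottom; for a 150x100 extent the top margin grows to 200 px. The top margin is therefore at least the full landmark height, not a small fraction of it.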

Thanks in advance, Davide

YuvalNirkin commented 2 years ago

It is just a heuristic for keeping the face in the middle of the bounding box; there is no real mathematical justification behind it.

dtoniolo commented 2 years ago

Got it, thanks again!