dlunion / DBFace

DBFace is a real-time, single-stage detector for face detection, with faster speed and higher accuracy

Bug Report: non-existing or invisible landmarks will crash the training program #60

Open SpaceView opened 2 years ago

SpaceView commented 2 years ago

To correct this problem, I made changes in the following file (each change is marked with a nearby "bymc" keyword): https://github.com/SpaceView/DBFace/blob/master/train/small/train-small-H-keep12-ignoresmall.py

Description

In parse_facials_webface(facials), a landmark that doesn't exist is not recorded. For example, if an image contains only half a face, there may be fewer than 5 landmark points:

if len(facial) >= 19:
    landmarks = []
    for i in range(5):
        x, y, t = facial[i * 3 + 4:i * 3 + 4 + 3]
        if t == -1:
            landmarks = None
            break  # `continue` would be better; if it is used, the landmark_gt_mask recording must be updated accordingly

        landmarks.append([x, y])

However, in LDataset.__getitem__, the program crashes if the number of landmarks is not 5:

if obj.haslandmark:
    reg_landmark = np.array(obj.x5y5_cat_landmark) / stride
    x5y5 = [cx]*5 + [cy]*5
    rvalue = (reg_landmark - x5y5)
    landmark_gt[0:10, cy, cx] = np.array(common.log(rvalue)) / 4
    landmark_mask[0, cy, cx] = 1    
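
The failure can be reproduced in isolation. This is a minimal sketch, assuming obj.x5y5_cat_landmark holds fewer than 10 values; the stride, center cell, and coordinates are made-up numbers for illustration. Subtracting the fixed 10-element x5y5 list from a shorter array raises a NumPy broadcasting error:

```python
import numpy as np

# Hypothetical values: a face with only 3 recorded landmarks,
# stored as [x1, x2, x3, y1, y2, y3], plus an assumed stride and center cell.
stride = 4
cx, cy = 10, 12
x5y5_cat_landmark = [40.0, 44.0, 48.0, 50.0, 52.0, 54.0]

reg_landmark = np.array(x5y5_cat_landmark) / stride  # shape (6,)
x5y5 = [cx] * 5 + [cy] * 5                           # always 10 values

try:
    rvalue = reg_landmark - x5y5
    error_message = None
except ValueError as exc:
    # NumPy cannot broadcast shape (6,) against shape (10,)
    error_message = str(exc)

print(error_message)
```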

Changes made

Add a landmark_gt_mask

landmark_gt_mask = np.zeros((1 * 10, fm_height, fm_width), np.float32)  # bymc

to mark the positions whose landmark values don't exist.

if obj.haslandmark:
    # NOTE: there may be fewer than 5 actual landmarks, since some landmarks may be invisible
    #       ref. parse_facials_webface
    # bymc
    reg_landmark = np.array(obj.x5y5_cat_landmark) / stride  # obj.x5y5_cat_landmark: up to 10 values (x1, x2, ..., x5, y1, ..., y5)
    REGLEN = len(reg_landmark)  # number of landmark values
    REGPT = REGLEN // 2         # number of actual landmark points
    #x5y5 = [cx]*5 + [cy]*5  # e.g. [cx, cx, cx, cx, cx, cy, cy, cy, cy, cy], center in feature map
    x5y5 = [cx]*REGPT + [cy]*REGPT  # REGPT copies of cx, then REGPT copies of cy (center in feature map)
    rvalue = (reg_landmark - x5y5)  # x, y positions relative to the center
    #landmark_gt[0:10, cy, cx] = np.array(common.log(rvalue)) / 4
    landmark_gt[0:REGLEN, cy, cx] = np.array(common.log(rvalue)) / 4
    if REGLEN < 10:
        landmark_gt[REGLEN:10, cy, cx] = 0
        landmark_gt_mask[REGLEN:10, cy, cx] = 1  # index order is (channel, cy, cx), matching landmark_gt
    landmark_mask[0, cy, cx] = 1
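
One thing that may be worth checking in the snippet above: if obj.x5y5_cat_landmark is laid out as [x1..xk, y1..yk] with k < 5, writing it into channels 0:REGLEN puts the y values into channels k..2k-1 rather than 5..5+k-1. The following is a minimal sketch of a padding step that keeps x in channels 0-4 and y in channels 5-9. It is not the repository's code: the helper name pad_landmarks is hypothetical, and it uses plain relative offsets instead of common.log to stay self-contained.

```python
import numpy as np

def pad_landmarks(xy_cat, cx, cy, num_points=5):
    """Hypothetical helper: place k <= num_points landmarks into fixed channels.

    xy_cat is assumed to be [x1..xk, y1..yk], already divided by the stride.
    Returns (values, missing), both of length 2 * num_points; missing is 1
    where no ground truth exists.
    """
    k = len(xy_cat) // 2
    values = np.zeros(2 * num_points, np.float32)
    missing = np.ones(2 * num_points, np.float32)
    # x offsets go to channels 0..k-1
    values[0:k] = np.asarray(xy_cat[:k]) - cx
    # y offsets go to channels num_points..num_points+k-1
    values[num_points:num_points + k] = np.asarray(xy_cat[k:2 * k]) - cy
    missing[0:k] = 0
    missing[num_points:num_points + k] = 0
    return values, missing

# Three landmarks: x = 11, 12, 13 and y = 14, 15, 16, center at (10, 12)
values, missing = pad_landmarks([11.0, 12.0, 13.0, 14.0, 15.0, 16.0], cx=10, cy=12)
print(values)
print(missing)
```

This still assumes the recorded points are the first k of the 5; knowing which individual landmark is missing would require per-point flags from the parser.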

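The non-existing landmark_gt entries should now contribute nothing to the wing loss, so at masked positions the target is replaced by the prediction itself before the loss is computed: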

landmark_gt_mask    = landmark_gt_mask.to(self.gpu_master) # bymc
landmark_gt = landmark_gt + landmark * landmark_gt_mask  # masked (non-existing landmarks) don't contribute to wingloss # bymc
landmark_loss = self.landmark_loss(landmark, landmark_gt, landmark_mask)*0.1
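
The effect of this substitution can be checked on a few numbers: at masked positions the target equals the prediction, so the residual is exactly zero. This is a minimal pure-Python sketch, assuming the commonly used wing loss form (w * ln(1 + |d|/e) for |d| < w, otherwise |d| - C). The wing function and its parameter values are illustrative and are not taken from the repository's loss implementation.

```python
import math

def wing(diff, w=2.0, e=2.0):
    """Wing loss for a single residual (common form; parameters are illustrative)."""
    c = w - w * math.log(1.0 + w / e)  # makes the two pieces meet at |diff| == w
    a = abs(diff)
    return w * math.log(1.0 + a / e) if a < w else a - c

pred    = [0.30, -0.10, 0.80, 0.50]   # predicted landmark channels
gt      = [0.25, -0.20, 0.00, 0.00]   # last two have no ground truth (zero-filled)
missing = [0, 0, 1, 1]                # 1 marks a non-existing landmark value

# Same substitution as in the issue: gt + pred * mask
gt_masked = [g + p * m for g, p, m in zip(gt, pred, missing)]
losses = [wing(p - g) for p, g in zip(pred, gt_masked)]
print(losses)
```

Only the first two channels produce a nonzero loss; the masked channels contribute nothing regardless of what the network predicts there.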

I hope this is sufficient to cope with this problem; please check.