thepowerfuldeez / facemesh.pytorch

This is the PyTorch implementation of the paper Real-time Facial Surface Geometry from Monocular Video on Mobile GPUs (https://arxiv.org/pdf/1907.06724.pdf)
Apache License 2.0
287 stars 64 forks

Any idea how we can train facemesh model with our own dataset? #7

Open sainisanjay opened 1 year ago

sainisanjay commented 1 year ago

Thanks for the wonderful work. I would like to train this facemesh model on our own dataset, which has 98 keypoints. Could you help us get started? Which loss function should we use, given that we don't know what MediaPipe used during training? You have already mentioned that we should add BatchNorm layers for fine-tuning. Where exactly should they go: in both FaceMeshBlock and FaceMesh?

showbit01 commented 1 year ago

Did you get an answer to this?

sainisanjay commented 1 year ago

@showbit01 Not really, but I was able to train the model by adding BN layers and using PFLDLoss. The results are not as good as those of other state-of-the-art models, though.
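
For readers wondering what "adding a BN layer" could look like, here is a minimal sketch of a depthwise-separable residual block with BatchNorm placed after the convolutions and before the residual add. The class name `SeparableBlockWithBN` and the exact layer order are assumptions for illustration only; this is not the repository's FaceMeshBlock and not necessarily the placement used in the training run described here.

```python
import torch
import torch.nn as nn


class SeparableBlockWithBN(nn.Module):
    """Hypothetical residual block: depthwise conv -> pointwise conv -> BatchNorm -> PReLU."""

    def __init__(self, channels: int):
        super().__init__()
        self.convs = nn.Sequential(
            # Depthwise 3x3 convolution (one filter per input channel).
            nn.Conv2d(channels, channels, kernel_size=3, padding=1,
                      groups=channels, bias=False),
            # Pointwise 1x1 convolution mixes information across channels.
            nn.Conv2d(channels, channels, kernel_size=1, bias=False),
            # BatchNorm after the convolutions, before the residual add (assumed placement).
            nn.BatchNorm2d(channels),
        )
        self.act = nn.PReLU(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection; spatial size and channel count are unchanged.
        return self.act(self.convs(x) + x)


block = SeparableBlockWithBN(16)
out = block(torch.randn(4, 16, 32, 32))
```

The convolutions drop their bias here because BatchNorm already learns a per-channel shift. Remember to call `model.eval()` at inference time so that BatchNorm uses its running statistics.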

PFLDLoss:

import torch
import torch.nn as nn


class PFLDLoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, attribute_gt, landmark_gt, landmarks, train_batchsize):
        # Columns 1..5 of attribute_gt hold the per-sample attribute flags.
        attributes_w_n = attribute_gt[:, 1:6].float()
        # Fraction of the batch that has each attribute.
        mat_ratio = torch.mean(attributes_w_n, dim=0)
        # Inverse-frequency weights; attributes absent from the batch get train_batchsize.
        mat_ratio = torch.where(
            mat_ratio > 0,
            1.0 / mat_ratio.clamp(min=1e-12),
            torch.full_like(mat_ratio, float(train_batchsize)),
        )
        weight_attribute = torch.sum(attributes_w_n * mat_ratio, dim=1)
        # Squared L2 distance between ground-truth and predicted landmarks.
        l2_distant = torch.sum((landmark_gt - landmarks) ** 2, dim=1)
        return torch.mean(weight_attribute * l2_distant), torch.mean(l2_distant)
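
To make the expected tensor shapes concrete, here is a minimal sketch that restates the loss above as a plain function and calls it on dummy data. The function name `pfld_style_loss`, the 6-column attribute tensor, and the flattened 98 × 2 = 196 landmark vector are assumptions for illustration; the real data layout depends on your dataset.

```python
import torch


def pfld_style_loss(attribute_gt, landmark_gt, landmarks, train_batchsize):
    """Functional restatement of the attribute-weighted L2 loss (illustrative only)."""
    flags = attribute_gt[:, 1:6].float()          # per-sample attribute flags
    ratio = flags.mean(dim=0)                     # how often each attribute occurs in the batch
    inv = torch.where(ratio > 0,
                      1.0 / ratio.clamp(min=1e-12),
                      torch.full_like(ratio, float(train_batchsize)))
    weight = (flags * inv).sum(dim=1)             # per-sample weight
    l2 = ((landmark_gt - landmarks) ** 2).sum(dim=1)
    return (weight * l2).mean(), l2.mean()


# Dummy batch of 2 samples: the first has one attribute set, the second has none.
attribute_gt = torch.tensor([[0, 1, 0, 0, 0, 0],
                             [0, 0, 0, 0, 0, 0]])
landmark_gt = torch.zeros(2, 196)   # assumed layout: 98 keypoints * (x, y)
landmarks = torch.ones(2, 196)      # stand-in predictions

weight = (attribute_gt[:, 1:6].float()
          * torch.tensor([2.0, 2.0, 2.0, 2.0, 2.0])).sum(dim=1)
weighted_loss, plain_l2 = pfld_style_loss(attribute_gt, landmark_gt, landmarks,
                                          train_batchsize=2)
```

One thing this makes visible: with this formula, a sample whose five flags are all zero gets a weight of 0 in the first returned term, so it only contributes through the second, unweighted term. Whether that is desirable for a dataset without such attributes is worth checking before reusing the loss as-is.
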
showbit01 commented 1 year ago

Okay, that's good, but I want to understand the whole training process, as the paper says little about it. Could you describe the full supervised training setup here?
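
As a starting point for that discussion, a generic supervised landmark-regression loop might look like the sketch below. Everything here is an assumption for illustration: `TinyLandmarkNet` is a hypothetical stand-in rather than the FaceMesh architecture, the 64 × 64 random tensors replace real cropped face images and normalized labels, and plain MSE with Adam is a common default, not the recipe used by the paper's authors (which is not described in this thread).

```python
import torch
import torch.nn as nn

NUM_KEYPOINTS = 98  # matches the dataset mentioned in this thread


class TinyLandmarkNet(nn.Module):
    """Hypothetical stand-in backbone with a 98 * 2 regression head."""

    def __init__(self, num_keypoints: int = NUM_KEYPOINTS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(16),
            nn.PReLU(16),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(32),
            nn.PReLU(32),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, num_keypoints * 2)

    def forward(self, x):
        # Output is a flat vector of (x, y) coordinates per keypoint.
        return self.head(self.features(x).flatten(1))


def train_steps(model, images, targets, steps=5, lr=1e-3):
    """Run a few supervised regression steps and return the per-step losses."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    model.train()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


torch.manual_seed(0)
images = torch.rand(8, 3, 64, 64)            # random stand-in for cropped face images
targets = torch.rand(8, NUM_KEYPOINTS * 2)   # random stand-in for normalized (x, y) labels
model = TinyLandmarkNet()
losses = train_steps(model, images, targets)
```

In a real run, the random tensors would be replaced by a `DataLoader` over your annotated images, the stand-in network by the actual model with its final layer resized to 98 × 2 outputs, and `nn.MSELoss` by whichever landmark loss you settle on.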