ShiqiYu / libfacedetection.train

The training program for libfacedetection for face detection and 5-landmark detection.
Apache License 2.0

Request for the prior box parameters of the ONNX model #55

Closed: enemy1205 closed this issue 1 year ago

enemy1205 commented 1 year ago

In the file config/yufacedet.yaml:

  anchor:
    min_sizes: [[10, 16, 24], [32, 48], [64, 96], [128, 192, 256]]
    steps: [8, 16, 32, 64]
    ratio: [1.]
    clip: False

These parameters are based on the weights/yunet_final.pth model. But when I use a model with a fixed input size, e.g. onnx/yunet_yunet_final_320_320_simplify.onnx, this set of parameters does not match. I would like a set of parameters that fits the ONNX model. Thanks.

fengyuentau commented 1 year ago

@Wwupup Please take a look.

Wwupup commented 1 year ago

Hello, I just tested this model and could not reproduce the problem you describe. Please confirm that the input is scaled to 320x320. My test command is as follows:

python tools/compare_inference.py ./onnx/yunet_yunet_final_320_320_simplify.onnx --image test.jpg --mode 320,320

enemy1205 commented 1 year ago

However, according to this code:

from itertools import product

import torch


class PriorBox(object):
    def __init__(self, min_sizes, steps, clip, ratio):
        super(PriorBox, self).__init__()
        self.min_sizes = min_sizes
        self.steps = steps
        self.clip = clip
        self.ratio = ratio

    def __call__(self, image_size):
        # image_size is (height, width), e.g. (320, 320)

        # repeated halving of the input; for a 320x320 input the maps are 80, 40, 20, 10, 5
        feature_map_2th = [int(int((image_size[0] + 1) / 2) / 2),
                           int(int((image_size[1] + 1) / 2) / 2)]  # stride 4:  80
        feature_map_3th = [int(feature_map_2th[0] / 2),
                           int(feature_map_2th[1] / 2)]            # stride 8:  40
        feature_map_4th = [int(feature_map_3th[0] / 2),
                           int(feature_map_3th[1] / 2)]            # stride 16: 20
        feature_map_5th = [int(feature_map_4th[0] / 2),
                           int(feature_map_4th[1] / 2)]            # stride 32: 10
        feature_map_6th = [int(feature_map_5th[0] / 2),
                           int(feature_map_5th[1] / 2)]            # stride 64: 5

        # only the stride-8/16/32/64 maps are used, matching steps = [8, 16, 32, 64]
        feature_maps = [feature_map_3th, feature_map_4th,
                        feature_map_5th, feature_map_6th]
        anchors = []
        for k, f in enumerate(feature_maps):
            min_sizes = self.min_sizes[k]
            # one prior per (grid cell, min_size, ratio) combination
            for i, j in product(range(f[0]), range(f[1])):
                for min_size in min_sizes:
                    cx = (j + 0.5) * self.steps[k] / image_size[1]
                    cy = (i + 0.5) * self.steps[k] / image_size[0]
                    for r in self.ratio:
                        s_ky = min_size / image_size[0]
                        s_kx = r * min_size / image_size[1]
                        anchors += [cx, cy, s_kx, s_ky]
        # back to torch land
        output = torch.Tensor(anchors).view(-1, 4)
        if self.clip:
            output.clamp_(max=1, min=0)
        return output

feature_maps will be [40, 20, 10, 5]. With min_sizes: [[10, 16, 24], [32, 48], [64, 96], [128, 192, 256]], I count 40×40 + 20×20 + 10×10 + 5×5 = 2125 priors, but the output of the ONNX model contains 5875 priors, and 5875 is not an integer multiple of 2125. I can't understand this, though I know your code should be fine.

Wwupup commented 1 year ago

Hello. With feature_maps = [40, 20, 10, 5] and min_sizes = [[10, 16, 24], [32, 48], [64, 96], [128, 192, 256]], the anchor count is anchor_num = 40×40×3 + 20×20×2 + 10×10×2 + 5×5×3 = 5875.
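
A quick sanity check of that arithmetic (plain Python, no torch needed): every grid cell emits one prior per entry in its level's min_sizes list, and ratio: [1.] contributes a single factor of one.

# Prior count per level = cells × len(min_sizes[level]) × len(ratio)
feature_maps = [40, 20, 10, 5]        # strides 8, 16, 32, 64 on a 320x320 input
min_sizes = [[10, 16, 24], [32, 48], [64, 96], [128, 192, 256]]
total = sum(f * f * len(ms) for f, ms in zip(feature_maps, min_sizes))
print(total)  # 4800 + 800 + 200 + 75 = 5875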

enemy1205 commented 1 year ago

Thanks for the reply. Since I'm rewriting the inference in C++ with TensorRT, I'm not familiar with this kind of nested traversal.
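
For reference when porting: the traversal above is just nested loops over (level, grid cell, min_size, ratio). Below is a standalone sketch of the same logic (the name generate_priors is made up here, not from the repo), assuming a 320×320 input and the yaml parameters quoted earlier; it computes the feature-map sizes as image_size // step, which matches the repeated-halving code in PriorBox for 320×320 but may differ for input sizes that are not multiples of 64.

# Standalone sketch of the prior generation above (no torch), easy to port to C++ loop by loop.
def generate_priors(image_size=(320, 320),
                    min_sizes=([10, 16, 24], [32, 48], [64, 96], [128, 192, 256]),
                    steps=(8, 16, 32, 64),
                    ratios=(1.0,)):
    priors = []  # flat list of (cx, cy, w, h), all normalized to [0, 1]
    for k, step in enumerate(steps):
        # feature-map size for this level; 40, 20, 10, 5 for a 320x320 input
        fm_h, fm_w = image_size[0] // step, image_size[1] // step
        for i in range(fm_h):
            for j in range(fm_w):
                for min_size in min_sizes[k]:
                    cx = (j + 0.5) * step / image_size[1]
                    cy = (i + 0.5) * step / image_size[0]
                    for r in ratios:
                        priors.append((cx, cy,
                                       r * min_size / image_size[1],
                                       min_size / image_size[0]))
    return priors

print(len(generate_priors()))  # 5875 for the 320x320 ONNX model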