thohemp / 6DRepNet

Official Pytorch implementation of 6DRepNet: 6D Rotation representation for unconstrained head pose estimation.
MIT License

Preprocess part in train.py code and demo.py are different #4

Closed YaoQ closed 2 years ago

YaoQ commented 2 years ago

Thanks for your great work.

I tried to test and train the 6DRepNet model and found some issues.

  1. Preprocess code in train.py

    normalize = transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])
    
    transformations = transforms.Compose([transforms.Resize(240),
                                          transforms.RandomCrop(224),
                                          transforms.ToTensor(),
                                          normalize])

    Preprocess code in demo.py

                img = frame[y_min:y_max,x_min:x_max]
               # cv2.imshow("crop", img)
               # cv2.waitKey(5)
                img = cv2.resize(img, (244, 244))/255.0
                img = img.transpose(2, 0, 1)
                img = torch.from_numpy(img).type(torch.FloatTensor)
                img = torch.Tensor(img).cuda(gpu)

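    The normalization and input size are different: train.py applies mean/std normalization and crops to 224, while demo.py only divides by 255 and resizes to 244.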

  2. I downloaded the pre-trained RepVGG model 'RepVGG-A0-train.pth' from here.

Using only the demo.py code to test one image containing 9 faces, the output is wrong: all 9 faces get the same yaw, roll and pitch values.

I also tested the fine-tuned models from here, and the pose values look good.

So what is the difference between the pre-trained RepVGG model and the fine-tuned models?
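Regarding point 1, here is a minimal sketch of how the demo-side preprocessing could be brought in line with the normalization in train.py. It is not the repository's code. The function name `normalize_face_crop` is made up for illustration, and it assumes the crop has already been resized to 224x224 (for example with `cv2.resize(img, (224, 224))`) and that the training images were loaded as RGB, so the BGR frame from OpenCV is flipped first.

```python
import numpy as np

# Mean/std values copied from the transforms.Normalize call in train.py.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_face_crop(crop_bgr):
    """Convert a 224x224 BGR uint8 crop into a normalized CHW float32 array.

    Hypothetical helper: assumes the model was trained on RGB input.
    """
    if crop_bgr.shape != (224, 224, 3):
        raise ValueError("expected a 224x224x3 crop, got %s" % (crop_bgr.shape,))
    # Flip BGR -> RGB and scale to [0, 1], like transforms.ToTensor() does.
    rgb = crop_bgr[:, :, ::-1].astype(np.float32) / 255.0
    # Per-channel normalization, like transforms.Normalize(mean, std).
    rgb = (rgb - IMAGENET_MEAN) / IMAGENET_STD
    # HWC -> CHW, the layout PyTorch models expect.
    return np.ascontiguousarray(rgb.transpose(2, 0, 1))
```

The returned array could then be passed to `torch.from_numpy(...)` and given a batch dimension before inference, as demo.py already does for its own input.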

thohemp commented 2 years ago
  1. Thanks, good catch.
  2. We use RepVGG as the network backbone (https://arxiv.org/abs/2101.03697, https://github.com/DingXiaoH/RepVGG). RepVGG also provides pre-trained models that were trained on ImageNet for general image classification. Even though they are not explicitly trained for head pose estimation, they already provide basic feature extraction capabilities, which we use as the starting point for fine-tuning on the head pose estimation task.

So, when you want to train 6DRepNet, starting from the pre-trained RepVGG models instead of training from scratch will improve your results. The fine-tuned models that we provide are already trained for head pose estimation and can be used for demo/inference purposes.
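One way to picture the difference between the two kinds of checkpoints is to compare the parameter names a pose model expects with the names a checkpoint contains. The sketch below is only an illustration: `split_checkpoint_keys` is a hypothetical helper and the key names are invented, so the real names in the RepVGG and 6DRepNet checkpoints will differ.

```python
def split_checkpoint_keys(model_keys, checkpoint_keys):
    """Group parameter names by whether a checkpoint can supply them."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Parameters the checkpoint can initialize.
        "loaded": sorted(model_keys & checkpoint_keys),
        # Parameters the model needs but the checkpoint lacks.
        "missing": sorted(model_keys - checkpoint_keys),
        # Checkpoint entries the model has no use for.
        "unused": sorted(checkpoint_keys - model_keys),
    }


# Invented parameter names, for illustration only.
pose_model_keys = ["backbone.conv.weight", "pose_head.weight", "pose_head.bias"]
imagenet_ckpt_keys = ["backbone.conv.weight", "classifier.weight", "classifier.bias"]

report = split_checkpoint_keys(pose_model_keys, imagenet_ckpt_keys)
print(report["missing"])  # the pose-specific parameters
```

In this toy setup the classification checkpoint covers only the backbone, so anything specific to head pose would still have to be learned by fine-tuning, whereas a fine-tuned checkpoint would cover every parameter.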

YaoQ commented 2 years ago

Got it, thanks for your response.