You designed a very useful and lightweight model for head pose estimation, but you also use several augmentations.
In my experiments, a very simple network can reach the same test MAE on the dataset you use, once I add RandomCrop. I ran many experiments on that dataset in PyTorch and got almost the same MAE as published in your paper.
So maybe data augmentation is the key?
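To make the comparison concrete, here is a minimal sketch of the kind of random-crop augmentation I mean, written with plain torch so it is self-contained; the crop size (96) and source size are just example values, not the exact numbers from either experiment. It behaves like torchvision.transforms.RandomCrop applied to a CHW tensor:

```python
import torch

def random_crop(img: torch.Tensor, size: int) -> torch.Tensor:
    # img: (C, H, W) tensor; returns a random size x size patch.
    # Analogous to torchvision.transforms.RandomCrop (without padding).
    _, h, w = img.shape
    top = torch.randint(0, h - size + 1, (1,)).item()
    left = torch.randint(0, w - size + 1, (1,)).item()
    return img[:, top:top + size, left:left + size]

# Example: crop a 96x96 training patch from a 110x110 face image.
patch = random_crop(torch.randn(3, 110, 110), 96)
```

For pose estimation the crop only translates the face, so the yaw/pitch/roll labels can stay unchanged, which is what makes this augmentation cheap to apply.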
This is my simple network structure:

Network(
  (conv1): Sequential(
    (0): Conv2d(3, 8, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=8)
  )
  (conv2_1): Sequential(
    (0): Conv2d(8, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=8)
  )
  (conv2_2): Sequential(
    (0): Conv2d(8, 8, kernel_size=(1, 1), stride=(1, 1))
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=8)
  )
  (conv2): Sequential(
    (0): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=16)
  )
  (conv3_1): Sequential(
    (0): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=16)
  )
  (conv3_2): Sequential(
    (0): Conv2d(16, 16, kernel_size=(1, 1), stride=(1, 1))
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=16)
  )
  (conv3): Sequential(
    (0): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=32)
  )
  (conv4_1): Sequential(
    (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=32)
  )
  (conv4_2): Sequential(
    (0): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=32)
  )
  (conv4): Sequential(
    (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=64)
  )
  (conv5_1): Sequential(
    (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=64)
  )
  (conv5_2): Sequential(
    (0): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=64)
  )
  (conv5): Sequential(
    (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): PReLU(num_parameters=128)
  )
  (fc): Linear(in_features=1152, out_features=128, bias=True)
  (relu): PReLU(num_parameters=128)
  (classifier): Linear(in_features=128, out_features=3, bias=True)
)
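The printed repr above only shows the modules, not the forward pass, so here is a runnable sketch of how such a network could be assembled. The pooling placement (MaxPool2d(2) after each conv group) and the 96x96 input size are my assumptions, chosen so that the flattened feature size works out to 1152 (128 channels x 3 x 3); they are not taken from the repr itself:

```python
import torch
import torch.nn as nn

def block(cin, cout, k, stride=1, padding=0):
    # Conv -> BatchNorm -> PReLU, matching the printed structure.
    return nn.Sequential(
        nn.Conv2d(cin, cout, k, stride=stride, padding=padding),
        nn.BatchNorm2d(cout),
        nn.PReLU(cout),
    )

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = block(3, 8, 3, stride=2, padding=1)
        self.conv2_1 = block(8, 8, 3, padding=1)
        self.conv2_2 = block(8, 8, 1)
        self.conv2 = block(8, 16, 3, padding=1)
        self.conv3_1 = block(16, 16, 3, padding=1)
        self.conv3_2 = block(16, 16, 1)
        self.conv3 = block(16, 32, 3, padding=1)
        self.conv4_1 = block(32, 32, 3, padding=1)
        self.conv4_2 = block(32, 32, 1)
        self.conv4 = block(32, 64, 3, padding=1)
        self.conv5_1 = block(64, 64, 3, padding=1)
        self.conv5_2 = block(64, 64, 1)
        self.conv5 = block(64, 128, 3, padding=1)
        self.pool = nn.MaxPool2d(2)          # assumed downsampling step
        self.fc = nn.Linear(1152, 128)       # 128 x 3 x 3 for a 96x96 input
        self.relu = nn.PReLU(128)
        self.classifier = nn.Linear(128, 3)  # yaw, pitch, roll

    def forward(self, x):
        # 96 -> 48 (stride-2 conv), then 48 -> 24 -> 12 -> 6 -> 3 via pooling
        x = self.conv1(x)
        x = self.pool(self.conv2(self.conv2_2(self.conv2_1(x))))
        x = self.pool(self.conv3(self.conv3_2(self.conv3_1(x))))
        x = self.pool(self.conv4(self.conv4_2(self.conv4_1(x))))
        x = self.pool(self.conv5(self.conv5_2(self.conv5_1(x))))
        x = torch.flatten(x, 1)
        return self.classifier(self.relu(self.fc(x)))
```

With these assumptions, a (N, 3, 96, 96) batch produces a (N, 3) output of continuous pose angles; any other input size that reaches a 3x3 feature map would work the same way.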