natanielruiz / deep-head-pose

:fire::fire: Deep Learning Head Pose Estimation using PyTorch.

Data preprocessing for 300W_LP and AFLW2000. #89

Open CrossEntropy opened 4 years ago

CrossEntropy commented 4 years ago

Hi @natanielruiz, great paper and work! I have a question about data preprocessing: my model performs well on the training set but poorly on the test set. Since I am training a small network, I limit the input image resolution to 56x56. For 300W_LP, I used your method:

img = Image.open(path)
x_min, y_min = float(cor[index, 0]), float(cor[index, 1])
x_max, y_max = float(cor[index, 2]), float(cor[index, 3])
k = np.random.random_sample() * 0.2 + 0.2
x_min -= 0.6 * k * abs(x_max - x_min)
y_min -= 2 * k * abs(y_max - y_min)
x_max += 0.6 * k * abs(x_max - x_min)
y_max += 0.6 * k * abs(y_max - y_min)
img = img.crop((int(x_min), int(y_min), int(x_max), int(y_max)))
prob = np.random.random_sample()
if prob < 0.5:
    yaws[index] = -yaws[index]
    bins[index] = nums - 1 - bins[index]
    img = img.transpose(Image.FLIP_LEFT_RIGHT)  
prob = np.random.random_sample()
if prob < 0.05:
    img = img.filter(ImageFilter.BLUR)

Finally, I use bilinear interpolation to downsample the image to 56x56:

img = img.resize((56, 56), resample=Image.BILINEAR)

AFLW2000 follows the same process, with a fixed k and no augmentation:

img = Image.open(path)
x_min, y_min = float(cor[index, 0]), float(cor[index, 1])
x_max, y_max = float(cor[index, 2]), float(cor[index, 3])
k = 0.20
x_min -= 2 * k * abs(x_max - x_min)
y_min -= 2 * k * abs(y_max - y_min)
x_max += 2 * k * abs(x_max - x_min)
y_max += 0.6 * k * abs(y_max - y_min)
img = img.crop((int(x_min), int(y_min), int(x_max), int(y_max)))
img = img.resize((56, 56), resample=Image.BILINEAR)
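
Both snippets apply the same box-expansion step with different factors, which can be sketched as a single helper. This is only a sketch, not code from the repository: `expand_box` and its parameter names are hypothetical. It keeps the update order of the snippets above, so the right and bottom margins are computed from the already-enlarged box.

```python
def expand_box(x_min, y_min, x_max, y_max, k, left, top, right, bottom):
    """Enlarge a face box the way the snippets above do (hypothetical helper).

    The updates are applied in order, so the right margin is scaled by the
    width after the left edge has already moved, and the bottom margin by
    the height after the top edge has already moved.
    """
    x_min -= left * k * abs(x_max - x_min)
    y_min -= top * k * abs(y_max - y_min)
    x_max += right * k * abs(x_max - x_min)
    y_max += bottom * k * abs(y_max - y_min)
    return x_min, y_min, x_max, y_max


# 300W_LP training crop: k is drawn at random from [0.2, 0.4) in the post;
# 0.3 is used here only to make the example deterministic.
train_box = expand_box(100.0, 100.0, 200.0, 200.0, k=0.3,
                       left=0.6, top=2, right=0.6, bottom=0.6)

# AFLW2000 test crop: fixed k = 0.20 and horizontal factors of 2.
test_box = expand_box(100.0, 100.0, 200.0, 200.0, k=0.20,
                      left=2, top=2, right=2, bottom=0.6)
```

Written this way, it is easy to see that the horizontal factors in the post differ between the two datasets (0.6 for training, 2 for testing).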

The following figure shows the MAE curves for my training, validation, and test sets. Why is the test-set result so bad? Is it because my data is processed incorrectly? Thanks for your help!

[Figure: MAE curves for the training, validation, and test sets]

chuzcjoe commented 4 years ago

Have you solved the issue? We have the same problem here as well.

CrossEntropy commented 4 years ago

Hi @chuzcjoe, sorry for the late reply; I've been busy with other things lately. In the AFLW2000 landmarks, some points have a minimum coordinate less than 0 or a maximum coordinate greater than the image width or height. You need to clip them. Hope this helps!
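
The clipping step can be sketched as follows. This is a minimal sketch under assumptions: the landmarks are given as two sequences of x and y coordinates, the face box is taken as their min/max, and `clip_box_from_landmarks` is a hypothetical name rather than a function from the repository.

```python
def clip_box_from_landmarks(xs, ys, width, height):
    """Return the landmark bounding box clipped to the image bounds.

    xs, ys: landmark x and y coordinates (may fall outside the image).
    width, height: image size in pixels.
    """
    x_min = max(min(xs), 0)
    y_min = max(min(ys), 0)
    x_max = min(max(xs), width)
    y_max = min(max(ys), height)
    return x_min, y_min, x_max, y_max


# Example: some landmarks lie outside a 450x450 image.
box = clip_box_from_landmarks([-12.5, 130.0, 470.2], [35.0, -4.0, 300.0], 450, 450)
```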

wqz960 commented 3 years ago

@CrossEntropy I trained the model, but it does not converge. Have you run into this problem?

CrossEntropy commented 3 years ago

@wqz960 I haven't encountered this problem. Have you cleaned the dataset correctly?

wqz960 commented 3 years ago

@CrossEntropy I did not clean the data; I used all the images (about 30K) for training. The lowest losses for yaw, pitch, and roll are about ±1.5. When I evaluate the model on AFLW2000, the results are bad: 7.5 for yaw, 10 for pitch, and 10 for roll, which differs from the paper. Can you give me some advice? Thank you! My WeChat is zz362379625.

CrossEntropy commented 3 years ago

@wqz960 In the AFLW2000 landmarks, some points have a minimum coordinate less than 0 or a maximum coordinate greater than the image width or height. You need to clip them.

Susiehub commented 2 years ago

How do you get the head pose (yaw, pitch, roll) from the AFLW2000 and 300W datasets?

I have downloaded the datasets and am trying to process the data and labels. I don't understand where you got the Euler head-pose angles from in these datasets.

For example, I checked one of the .mat files provided for a sample. It has a pose_param field with a 1x7 array. How do I get the Euler angles from it? 0.085767917 -0.095569924 0.075868770 220.44441 167.35301 -100.53674 0.0012647437
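
One way to read such an array can be sketched as follows. This rests on an assumption that is not confirmed anywhere in this thread: that the seven values are laid out as [pitch, yaw, roll, tdx, tdy, tdz, scale], with the three angles in radians. `euler_degrees_from_pose` is a hypothetical helper, not a function from the repository.

```python
import math


def euler_degrees_from_pose(pose):
    """Split a 7-value pose array into (yaw, pitch, roll) in degrees.

    Assumes the layout [pitch, yaw, roll, tdx, tdy, tdz, scale] with the
    three angles stored in radians; verify this against your annotations.
    """
    pitch, yaw, roll = (math.degrees(a) for a in pose[:3])
    return yaw, pitch, roll


# The array quoted in the comment above.
pose = [0.085767917, -0.095569924, 0.075868770,
        220.44441, 167.35301, -100.53674, 0.0012647437]
yaw, pitch, roll = euler_degrees_from_pose(pose)
```

Under that assumption, the quoted sample corresponds to small angles of only a few degrees each, which is what one would expect for a near-frontal face.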

cunesewangst commented 2 years ago

Hi, how do I train the network correctly? I am using the 300W-LP dataset for training, but the loss barely changes and does not converge. The following figure shows the result on 300W-LP with a batch size of 64 after 3 epochs.

[Figure: training result, not preserved]

Could you tell me exactly how the training command is written, and what the snapshot parameter refers to? Thanks.

cunesewangst commented 2 years ago

@kellen5l Could you tell me exactly how the training command is written, and what the snapshot parameter refers to? Thanks.

kellen5l commented 2 years ago

@cunesewangst Sorry. I completely forgot.

lovegit2021 commented 1 year ago

Hi, how do I train the network correctly? I am using the 300W-LP dataset for training, but the loss barely changes and does not converge. The following figure shows the result on 300W-LP with a batch size of 64 after 3 epochs.

[Figure: training result, not preserved]

Could you tell me exactly how the training command is written, and what the snapshot parameter refers to? Thanks. @cunesewangst Did your training converge in the end?