natanielruiz / deep-head-pose

:fire::fire: Deep Learning Head Pose Estimation using PyTorch.

Inconsistency in training loss (300W-LP) and testing loss (AFLW2000). What should be the convergence criterion, and when should the best model be saved? #131

Open GKG1312 opened 8 months ago

GKG1312 commented 8 months ago

Hi @natanielruiz @tfygg, I am training the model again on the 300W-LP dataset with a filtered filename list. There is high fluctuation in the training loss, as already reported in issues #6 and #10, even though for some iterations the losses are quite low:

Epoch [25/25], Iter [600/3825] Losses: Yaw 2.5382, Pitch 25.2214, Roll 18.6293
Epoch [25/25], Iter [700/3825] Losses: Yaw 3.4427, Pitch 56.4101, Roll 60.4185
Epoch [25/25], Iter [800/3825] Losses: Yaw 3.8120, Pitch 10.9580, Roll 12.5700
Epoch [25/25], Iter [900/3825] Losses: Yaw 6.2587, Pitch 36.2516, Roll 29.5404
Epoch [25/25], Iter [1000/3825] Losses: Yaw 3.9143, Pitch 13.5918, Roll 11.6238
Epoch [25/25], Iter [1100/3825] Losses: Yaw 2.8406, Pitch 16.2069, Roll 11.7216
Epoch [25/25], Iter [1200/3825] Losses: Yaw 3.1640, Pitch 6.9615, Roll 3.9374
Epoch [25/25], Iter [1300/3825] Losses: Yaw 4.6969, Pitch 8.0815, Roll 9.0429
Epoch [25/25], Iter [1400/3825] Losses: Yaw 3.1008, Pitch 6.8233, Roll 4.4145
Epoch [25/25], Iter [1500/3825] Losses: Yaw 3.5320, Pitch 53.3095, Roll 41.4802
Epoch [25/25], Iter [1600/3825] Losses: Yaw 3.7685, Pitch 7.2890, Roll 8.7627
Epoch [25/25], Iter [1700/3825] Losses: Yaw 3.2166, Pitch 19.6407, Roll 12.9610
Epoch [25/25], Iter [1800/3825] Losses: Yaw 3.6263, Pitch 6.8446, Roll 5.8751
Epoch [25/25], Iter [1900/3825] Losses: Yaw 3.7254, Pitch 12.2385, Roll 9.0497
Epoch [25/25], Iter [2000/3825] Losses: Yaw 4.3334, Pitch 10.8476, Roll 4.3712
Epoch [25/25], Iter [2100/3825] Losses: Yaw 4.8823, Pitch 13.0971, Roll 17.6704
Epoch [25/25], Iter [2200/3825] Losses: Yaw 2.9647, Pitch 5.1831, Roll 5.9912
Epoch [25/25], Iter [2300/3825] Losses: Yaw 2.6243, Pitch 20.3848, Roll 10.7074
Epoch [25/25], Iter [2400/3825] Losses: Yaw 4.3780, Pitch 16.6918, Roll 10.1041
Epoch [25/25], Iter [2500/3825] Losses: Yaw 2.6419, Pitch 29.8599, Roll 23.3731
Epoch [25/25], Iter [2600/3825] Losses: Yaw 3.0582, Pitch 23.6246, Roll 15.0430
Epoch [25/25], Iter [2700/3825] Losses: Yaw 4.5449, Pitch 11.4036, Roll 9.0669
Epoch [25/25], Iter [2800/3825] Losses: Yaw 3.3777, Pitch 6.4258, Roll 4.7266
Epoch [25/25], Iter [2900/3825] Losses: Yaw 4.5212, Pitch 8.0623, Roll 5.5993
Epoch [25/25], Iter [3000/3825] Losses: Yaw 3.5405, Pitch 11.6594, Roll 9.8117
Epoch [25/25], Iter [3100/3825] Losses: Yaw 2.8780, Pitch 10.0156, Roll 9.4295
Epoch [25/25], Iter [3200/3825] Losses: Yaw 3.9240, Pitch 8.4466, Roll 4.5813
Epoch [25/25], Iter [3300/3825] Losses: Yaw 4.6378, Pitch 8.8315, Roll 8.9284
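
Since part of my question is about the convergence criterion: the raw per-batch losses above are too noisy to read a trend from, so a smoothed curve may be a better signal. Below is a minimal sketch (not code from this repo; the 0.98 decay is just an assumption) of an exponential moving average that could be printed next to the raw losses in train_hopenet.py:

```python
# Minimal sketch, not repo code: exponential moving average (EMA) of the
# per-batch losses, so convergence is judged on the trend rather than on
# individual batches. decay=0.98 is an assumed value.
class LossSmoother:
    def __init__(self, decay=0.98):
        self.decay = decay
        self.value = None

    def update(self, loss):
        loss = float(loss)
        # The first call seeds the average; later calls blend in the new loss.
        if self.value is None:
            self.value = loss
        else:
            self.value = self.decay * self.value + (1.0 - self.decay) * loss
        return self.value

# Inside the training loop, e.g.:
#   yaw_ema = LossSmoother()          # created once before the loop
#   smoothed_yaw = yaw_ema.update(loss_yaw.item())
```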

While testing the model on AFLW2000 I get a noticeably higher error in Yaw:

Test error in degrees of the model on the 1969 test images. Yaw: 13.6368, Pitch: 7.7751, Roll: 6.1729

I am saving my model at the best iteration, i.e. the one where all three errors are lowest.
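
On when to save the best model: instead of picking the iteration with the lowest training losses, I am considering computing the mean absolute error (MAE) in degrees on a held-out set after every epoch and keeping the checkpoint with the lowest mean. A minimal sketch, assuming the Hopenet-style binned output (66 bins of 3 degrees covering -99 to 99) and the (images, labels, cont_labels, name) batch ordering of the repo's datasets; train_one_epoch, model, and the loaders are hypothetical stand-ins for the objects in train_hopenet.py:

```python
import torch
import torch.nn.functional as F

idx_tensor = torch.arange(66, dtype=torch.float32)  # 66 bins of 3 degrees

def expected_angle(logits):
    # Hopenet-style continuous prediction: expectation over the bin softmax,
    # mapped back to degrees in [-99, 99].
    return torch.sum(F.softmax(logits, dim=1) * idx_tensor, dim=1) * 3 - 99

@torch.no_grad()
def evaluate_mae(model, loader):
    model.eval()
    err, n = torch.zeros(3), 0
    for images, labels, cont_labels, name in loader:
        yaw, pitch, roll = model(images)
        pred = torch.stack([expected_angle(yaw),
                            expected_angle(pitch),
                            expected_angle(roll)], dim=1)
        # cont_labels holds [yaw, pitch, roll] in degrees, as in datasets.py.
        err += (pred - cont_labels[:, :3]).abs().sum(dim=0)
        n += images.size(0)
    model.train()
    return err / n  # per-angle MAE in degrees

best_mae = float('inf')
for epoch in range(num_epochs):
    train_one_epoch(model, train_loader)  # hypothetical helper: one training pass
    yaw_mae, pitch_mae, roll_mae = evaluate_mae(model, test_loader)
    mean_mae = float(yaw_mae + pitch_mae + roll_mae) / 3
    if mean_mae < best_mae:
        best_mae = mean_mae
        torch.save(model.state_dict(), 'best_hopenet.pth')
```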

Can you help me find out the reason? The following is the command I am using to train the model (I am continuing training from a saved snapshot):

train_hopenet.py --data_dir ".\300W_LP" --filename_list "300W_LP_filename_filtered.txt" --snapshot "Pruned_Hopenet_0.5.pth" --batch_size 32 --dataset "Pose_300W_LP" --num_epochs 25 --alpha 1 --output_string "prunedReTrain_0.5_1st" --lr 0.00001

Note: just an observation, but the Yaw error is lower than the others during training, yet when testing on AFLW2000 the Yaw error is the largest. Is this because of the dataset? Did you observe anything like this when you trained and tested your model?
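
To check the dataset hypothesis myself, I could compare the yaw distributions of the two sets directly from the ground-truth .mat files. A minimal sketch (paths and the AFLW2000 list file are assumptions matching my setup; Pose_Para stores pitch/yaw/roll in radians, the same convention the repo's datasets.py relies on):

```python
import os
import numpy as np
import scipy.io as sio

def yaw_degrees(data_dir, filename_list):
    # Pose_Para holds [pitch, yaw, roll, ...] in radians; yaw is entry 1.
    yaws = []
    with open(filename_list) as f:
        for name in f.read().splitlines():
            mat = sio.loadmat(os.path.join(data_dir, name + '.mat'))
            yaws.append(mat['Pose_Para'][0][1] * 180.0 / np.pi)
    return np.array(yaws)

train_yaw = yaw_degrees('300W_LP', '300W_LP_filename_filtered.txt')
test_yaw = yaw_degrees('AFLW2000', 'AFLW2000_filename_list.txt')  # hypothetical list file
for tag, y in [('300W-LP', train_yaw), ('AFLW2000', test_yaw)]:
    print('%s: mean |yaw| %.1f deg, max |yaw| %.1f deg, frac |yaw| > 60 deg %.3f'
          % (tag, np.abs(y).mean(), np.abs(y).max(), (np.abs(y) > 60).mean()))
```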