hexiangnan / attentional_factorization_machine

TensorFlow Implementation of Attentional Factorization Machine

overfitting? or not #5

Closed: CanoeFZH closed this issue 6 years ago

CanoeFZH commented 6 years ago

python AFM.py --dataset ml-tag --epoch 20 --pretrain 0 --batch_size 4096 --hidden_factor '[16,16]' --keep '[1.0,0.5]' --lamda_attention 100.0 --lr 0.1

params: 1537870

Init: train=1.0000, validation=1.0000 [3.2 s]
Epoch 1 [4.3 s] train=0.5210, validation=0.5746 [4.4 s]
Epoch 2 [4.4 s] train=0.4683, validation=0.5514 [4.3 s]
Epoch 3 [4.8 s] train=0.4326, validation=0.5401 [4.5 s]
Epoch 4 [5.0 s] train=0.4055, validation=0.5321 [4.7 s]
Epoch 5 [5.4 s] train=0.3874, validation=0.5274 [5.2 s]
Epoch 6 [5.0 s] train=0.3712, validation=0.5232 [5.3 s]
Epoch 7 [5.0 s] train=0.3558, validation=0.5204 [4.9 s]
Epoch 8 [5.3 s] train=0.3460, validation=0.5181 [4.9 s]
Epoch 9 [5.4 s] train=0.3375, validation=0.5163 [5.1 s]
Epoch 10 [5.4 s] train=0.3287, validation=0.5151 [5.0 s]
Epoch 11 [5.4 s] train=0.3242, validation=0.5136 [5.0 s]
Epoch 12 [5.1 s] train=0.3168, validation=0.5126 [5.5 s]
Epoch 13 [5.1 s] train=0.3119, validation=0.5118 [5.1 s]
Epoch 14 [5.5 s] train=0.3074, validation=0.5113 [5.1 s]
Epoch 15 [5.5 s] train=0.3068, validation=0.5106 [5.2 s]
Epoch 16 [5.5 s] train=0.3022, validation=0.5103 [5.3 s]
Epoch 17 [5.6 s] train=0.2984, validation=0.5098 [5.3 s]
Epoch 18 [5.2 s] train=0.2964, validation=0.5092 [5.7 s]
Epoch 19 [5.3 s] train=0.2934, validation=0.5093 [5.3 s]
Epoch 20 [5.7 s] train=0.2908, validation=0.5089 [5.3 s]
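To make the question concrete, here is a minimal sketch (a hypothetical helper, not part of AFM.py) that parses epoch lines in the format above and prints the gap between training and validation loss; only the first and last epochs are shown as sample input.

```python
import re

# Hypothetical helper (not part of AFM.py): parse epoch lines like the ones
# above and print the gap between training and validation loss per epoch.
log = """
Epoch 1 [4.3 s] train=0.5210, validation=0.5746 [4.4 s]
Epoch 20 [5.7 s] train=0.2908, validation=0.5089 [5.3 s]
"""

pattern = re.compile(r"Epoch (\d+) .*?train=([\d.]+), validation=([\d.]+)")
for epoch, train, valid in pattern.findall(log):
    train, valid = float(train), float(valid)
    # A widening gap while validation improves only slowly is a sign of overfitting.
    print(f"epoch {epoch:>2}: train={train:.4f}  validation={valid:.4f}  gap={valid - train:.4f}")
```

Over the 20 epochs above, the gap grows from roughly 0.054 to 0.218 while the validation loss improves only slowly, which is what prompted the question.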
hexiangnan commented 6 years ago

As you can see, the convergence has slowed down; it is overfitting under this set of parameters. Try these:

1: Enlarge the hidden_factor (e.g. [32,256]), as it improves the capacity of the model (see the sketch after this list);

2: Train AFM starting from the pretrained parameters of FM.
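For context on suggestion 1, here is a minimal NumPy sketch of AFM's attention-pooled pairwise interaction, assuming hidden_factor = [embedding_size, attention_size]; the variable names (embeddings, W, b, h, p) are illustrative and not taken from AFM.py.

```python
import numpy as np

# Illustrative sketch; names and sizes are placeholders, not AFM.py's.
num_fields, embedding_size, attention_size = 3, 32, 256   # hidden_factor = [32, 256]

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(num_fields, embedding_size))  # embeddings of the active features
W = rng.normal(size=(embedding_size, attention_size))       # attention network weights
b = np.zeros(attention_size)                                # attention network bias
h = rng.normal(size=attention_size)                         # attention projection vector
p = rng.normal(size=embedding_size)                         # final prediction vector

# Element-wise products of all feature pairs: shape (num_pairs, embedding_size)
pairs = np.array([embeddings[i] * embeddings[j]
                  for i in range(num_fields) for j in range(i + 1, num_fields)])

# Attention scores over the pairs: h^T ReLU(W (v_i * v_j) + b), softmax-normalized
scores = np.maximum(pairs @ W + b, 0.0) @ h
weights = np.exp(scores - scores.max()) / np.exp(scores - scores.max()).sum()

# Attention-weighted sum of the pairwise interactions, projected to a scalar
interaction_term = (weights[:, None] * pairs).sum(axis=0) @ p
print("attention-pooled interaction term:", interaction_term)
```

Enlarging both sizes grows the embedding table as well as W, h, and p, which is where the extra capacity comes from.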

Here are my results:

1: python AFM.py --dataset ml-tag --epoch 10 --pretrain 0 --batch_size 4096 --hidden_factor '[32,256]' --keep '[1.0,0.5]' --lamda_attention 100.0 --lr 0.1

Init: train=1.0000, validation=1.0000 [9.7 s]
Epoch 1 [17.9 s] train=0.3919, validation=0.5203 [10.1 s]
Epoch 2 [17.0 s] train=0.2870, validation=0.4870 [9.8 s]
Epoch 3 [16.8 s] train=0.2481, validation=0.4813 [10.0 s]
Epoch 4 [18.4 s] train=0.1992, validation=0.4663 [10.3 s]
Epoch 5 [19.4 s] train=0.1705, validation=0.4588 [10.3 s]
Epoch 6 [18.1 s] train=0.1565, validation=0.4552 [10.1 s]
Epoch 7 [16.9 s] train=0.1403, validation=0.4509 [9.1 s]
Epoch 8 [17.2 s] train=0.1405, validation=0.4514 [9.8 s]
Epoch 9 [18.1 s] train=0.1257, validation=0.4480 [10.0 s]
Epoch 10 [17.3 s] train=0.1238, validation=0.4474 [8.9 s]

2: python AFM.py --dataset ml-tag --epoch 10 --pretrain 1 --batch_size 4096 --hidden_factor '[16,16]' --keep '[1.0,0.5]' --lamda_attention 100.0 --lr 0.1

Init: train=0.7103, validation=0.7238 [7.3 s]
Epoch 1 [9.0 s] train=0.4867, validation=0.5594 [8.4 s]
Epoch 2 [9.9 s] train=0.4363, validation=0.5403 [7.8 s]
Epoch 3 [8.1 s] train=0.4031, validation=0.5307 [8.3 s]
Epoch 4 [9.8 s] train=0.3796, validation=0.5238 [7.6 s]
Epoch 5 [9.6 s] train=0.3622, validation=0.5192 [8.7 s]
Epoch 6 [9.0 s] train=0.3476, validation=0.5150 [8.2 s]
Epoch 7 [9.6 s] train=0.3366, validation=0.5126 [7.2 s]
Epoch 8 [8.8 s] train=0.3263, validation=0.5108 [8.1 s]
Epoch 9 [9.8 s] train=0.3186, validation=0.5089 [8.2 s]
Epoch 10 [9.1 s] train=0.3104, validation=0.5072 [8.9 s]
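For suggestion 2, the idea behind --pretrain 1 is to initialize AFM's feature embeddings from an FM model trained on the same data instead of from random values. A rough sketch of that idea follows; the file path, array shapes, and helper name are assumptions for illustration, not the repo's actual ones.

```python
import numpy as np

# Rough sketch of the pretraining idea only; the path, format, and sizes
# below are placeholders, not the ones used by AFM.py.
num_features, embedding_size = 100000, 16   # placeholder sizes

def init_embeddings(pretrain_path=None):
    """Return the initial embedding table: FM's trained embeddings if
    available, otherwise a small random initialization."""
    if pretrain_path is not None:
        fm_embeddings = np.load(pretrain_path)   # expected shape (num_features, embedding_size)
        assert fm_embeddings.shape == (num_features, embedding_size)
        return fm_embeddings
    return np.random.normal(0.0, 0.01, size=(num_features, embedding_size))

# --pretrain 0: random start.  --pretrain 1: warm start from FM.
random_start = init_embeddings()
# warm_start = init_embeddings("pretrain/fm_ml-tag_16.npy")  # hypothetical path
print(random_start.shape)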

CanoeFZH commented 6 years ago

OK, but I think the gap between training loss and validation loss is still huge.

hexiangnan commented 6 years ago

Yes.