Closed: huaileiseu closed this issue 3 years ago
Hi,
There are too many experiment records. If you can provide more details about your settings, I can help you fix this problem.
I used the parameters from Table 5 in the paper. The command for DeepFM is:

```shell
nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=deepfm --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 1]]" > resultDeepFM_avazu.txt 2>&1 &
```
The best AUC was 0.781810, with loss 0.378762.
And with layer normalization (`ln`):

```shell
nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=deepfm --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 1]]" > resultDeepFM_avazu.txt 2>&1 &
```
The best AUC was 0.781523, with loss 0.378853.
And for PIN:

```shell
nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=pin --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 1]]" --sub_nn_layers="[[\"full\", 40], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 5], [\"ln\", \"\"]]" > resultPIN_avazu.txt 2>&1 &
```
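For readers unfamiliar with the `--nn_layers` format: it is a JSON list of `[op, arg]` pairs describing the top MLP layer by layer. Below is a minimal NumPy sketch of how such a spec could be interpreted; the function name, initialization, and the input width (24 fields × embed_size 40) are illustrative assumptions, not the actual implementation in `tf_main.py`.

```python
import json
import numpy as np

def build_forward(spec, in_dim, rng):
    """Turn an nn_layers-style spec into a forward function.

    spec: list of [op, arg] pairs, e.g. [["full", 500], ["ln", ""], ["act", "relu"]].
    Hypothetical NumPy re-implementation for illustration only.
    """
    ops, dim = [], in_dim
    for op, arg in spec:
        if op == "full":           # fully connected layer of width `arg`
            W = rng.standard_normal((dim, arg)) * 0.01
            ops.append(("full", (W, np.zeros(arg))))
            dim = arg
        elif op == "ln":           # layer normalization, no extra arg
            ops.append(("ln", None))
        elif op == "act":          # activation named by `arg`
            ops.append(("act", arg))

    def forward(x):
        for op, arg in ops:
            if op == "full":
                W, b = arg
                x = x @ W + b
            elif op == "ln":       # normalize over the feature axis
                mu = x.mean(axis=-1, keepdims=True)
                sd = x.std(axis=-1, keepdims=True)
                x = (x - mu) / (sd + 1e-6)
            elif op == "act" and arg == "relu":
                x = np.maximum(x, 0.0)
        return x

    return forward

spec = json.loads('[["full", 500], ["ln", ""], ["act", "relu"], ["full", 1]]')
f = build_forward(spec, in_dim=40 * 24, rng=np.random.default_rng(0))
out = f(np.ones((2, 40 * 24)))
print(out.shape)  # (2, 1)
```

Note how `ln` slots between `full` and `act`, which is exactly where the command above places it.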
The experiments were repeated three times, and the results were similar each time. Maybe I didn't make good use of `ln`. Thanks.
And I have sent you an email. Thanks for your patience.
Hi,
I have checked my experiment logs; this is DeepFM on Avazu. I think you can set l2_v = 1e-6 and check whether it works.
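For clarity, `l2_v` is the L2 coefficient on the embedding vectors. A minimal sketch of how such a penalty enters the training loss, assuming a plain sum-of-squares penalty with no 1/2 factor (the exact form used in the repo is not shown here):

```python
import numpy as np

def logloss_with_l2(y_true, y_pred, embed, l2_v=1e-6):
    """Binary log loss plus an L2 penalty on the embedding table.

    `l2_v` matches the coefficient discussed above; the function name
    and the exact penalty form are illustrative assumptions.
    """
    eps = 1e-7
    p = np.clip(y_pred, eps, 1.0 - eps)
    ll = -np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return ll + l2_v * np.sum(embed ** 2)

y = np.array([1.0, 0.0])
p = np.array([0.9, 0.1])
V = np.ones((3, 2))  # toy embedding table
loss = logloss_with_l2(y, p, V, l2_v=1e-6)
```

With `l2_v = 0` this reduces to the plain log loss reported in the tables below.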
DeepFM:

| epoch | decay | l2_v | drop_out | auc | logloss | net | embed |
|---|---|---|---|---|---|---|---|
| 5 | 0.8 | 1e-06 | 0.5 | 0.782641 | 0.378415 | 700*5 | 40 |
| 5 | 0.8 | 1e-06 | 0.7 | 0.783461 | 0.377938 | 700*5 | 40 |
| 5 | 0.8 | 1e-06 | 0.9 | 0.783198 | 0.377761 | 700*5 | 40 |
| 5 | 0.8 | 1e-06 | 1 | 0.783292 | 0.377685 | 700*5 | 40 |
| 5 | 0.8 | 1e-05 | 0.5 | 0.780782 | 0.379708 | 700*5 | 40 |
| 5 | 0.8 | 1e-05 | 0.7 | 0.781389 | 0.37918 | 700*5 | 40 |
| 5 | 0.8 | 1e-05 | 0.9 | 0.781446 | 0.378731 | 700*5 | 40 |
| 5 | 0.8 | 1e-05 | 1 | 0.781914 | 0.378472 | 700*5 | 40 |
| 5 | 0.8 | 1e-04 | 0.5 | 0.778269 | 0.380837 | 700*5 | 40 |
| 5 | 0.8 | 1e-04 | 0.7 | 0.778899 | 0.380691 | 700*5 | 40 |
| 5 | 0.8 | 1e-04 | 0.9 | 0.778618 | 0.380525 | 700*5 | 40 |
| 5 | 0.8 | 1e-04 | 1 | 0.779388 | 0.380004 | 700*5 | 40 |
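On the `decay` column: this reads as a learning-rate decay factor applied once per epoch. Whether `tf_main.py` decays per epoch or per step is an assumption here; the arithmetic itself is just exponential decay:

```python
def decayed_lr(base_lr, decay, epoch):
    """Learning rate after `epoch` epochs of exponential decay.

    Per-epoch application is an assumption about the repo's schedule.
    """
    return base_lr * decay ** epoch

# e.g. base learning rate 1e-3 with decay 0.8 after 5 epochs
print(decayed_lr(1e-3, 0.8, 5))  # ~3.2768e-04
```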
And PIN on Avazu:
| sub_nn | batch_size | factor | net | decay | epoch | loss | auc |
|---|---|---|---|---|---|---|---|
| 40,5 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.375942 | 0.786689 |
| 40,5 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376159 | 0.786459 |
| 40,5 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.375764 | 0.786983 |
| 40,5 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.375625 | 0.786931 |
| 40,5 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.375527 | 0.787035 |
| 40,1 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.376006 | 0.786417 |
| 40,1 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376224 | 0.786596 |
| 40,1 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.376154 | 0.786679 |
| 40,1 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.376184 | 0.786484 |
| 40,1 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.376165 | 0.786259 |
| 40,10 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.37566 | 0.786841 |
| 40,10 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376154 | 0.786318 |
| 40,10 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.375758 | 0.786791 |
| 40,10 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.375566 | 0.787166 |
| 40,10 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.375806 | 0.786953 |
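The `sub_nn` column gives the two `full` widths in `--sub_nn_layers`: a hidden width of 40 and an output width k (5, 1, or 10) for the micro-network PIN applies to each pair of field embeddings. A hedged NumPy sketch of that pairwise structure; the concatenated pair input and the shared weights are illustrative assumptions (the repo may feed the sub-net differently, e.g. including product terms):

```python
import numpy as np
from itertools import combinations

def pin_pairwise(embeds, W1, b1, W2, b2):
    """Apply one shared micro-network to every pair of field embeddings,
    in the spirit of --sub_nn_layers=[["full", 40], ..., ["full", k], ...].
    Shapes and weight sharing are assumptions for illustration.
    """
    outs = []
    for i, j in combinations(range(embeds.shape[0]), 2):
        x = np.concatenate([embeds[i], embeds[j]])  # pair input, 2*embed_size
        h = np.maximum(x @ W1 + b1, 0.0)            # "full" 40 + relu
        outs.append(h @ W2 + b2)                    # "full" k (k = 5 here)
    return np.concatenate(outs)  # flattened pair features for the top MLP

rng = np.random.default_rng(0)
num_fields, k = 4, 5
e = rng.standard_normal((num_fields, 40))
W1 = rng.standard_normal((80, 40)) * 0.1; b1 = np.zeros(40)
W2 = rng.standard_normal((40, k)) * 0.1; b2 = np.zeros(k)
print(pin_pairwise(e, W1, b1, W2, b2).shape)  # (30,) -> C(4,2) pairs * k
```

This also shows why the sub-net output width k changes the width of the top MLP's input, which is what the three `sub_nn` blocks in the table vary.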
I'm sorry for the unclear parameter report in the paper. I did not use l2 regularization on Avazu at first, because I found it did not work on other models. Later my second author, Bohui, helped me test DeepFM, and he used l2 regularization in his experiments; I think there was some misunderstanding when he reported the parameters to me. I copied these logs from my Excel sheet, so I am sure you can repeat these experiments.
I am sorry that I cannot find the DeepFM logs on Criteo, because those experiments were conducted by Bohui.
I'm willing to provide further help.
I ran some experiments with this repo following the paper, but there is a gap in AUC and loss between my results and those in the paper. Maybe my parameters are wrong, so could you provide the command lines used for the paper? Thanks.