Closed: huaileiseu closed this issue 3 years ago
Hi,
There are too many experiment records. If you can provide more details about your settings, I can help you fix this problem.
I used the parameters from Table 5 in the paper. The command for DeepFM is:

```shell
nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=deepfm --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 1]]" > resultDeepFM_avazu.txt 2>&1 &
```
The best AUC was 0.781810, with loss 0.378762.
And with layer normalization (`ln`):

```shell
nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=deepfm --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 1]]" > resultDeepFM_avazu.txt 2>&1 &
```
The best AUC was 0.781523, with loss 0.378853.
And for PIN:

```shell
nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=pin --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 1]]" --sub_nn_layers="[[\"full\", 40], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 5], [\"ln\", \"\"]]" > resultPIN_avazu.txt 2>&1 &
```
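For readers unfamiliar with the `--nn_layers` format: it is a JSON list of `[op, arg]` pairs describing the top MLP layer by layer. Below is a minimal NumPy sketch of how such a spec could be interpreted; the function name, initialization, and the input width (24 fields × embed_size 40) are illustrative assumptions, not the actual implementation in `tf_main.py`.

```python
import json
import numpy as np

def build_forward(spec, in_dim, rng):
    """Turn an nn_layers-style spec into a forward function.

    spec: list of [op, arg] pairs, e.g. [["full", 500], ["ln", ""], ["act", "relu"]].
    Hypothetical NumPy re-implementation for illustration only.
    """
    ops, dim = [], in_dim
    for op, arg in spec:
        if op == "full":           # fully connected layer of width `arg`
            W = rng.standard_normal((dim, arg)) * 0.01
            ops.append(("full", (W, np.zeros(arg))))
            dim = arg
        elif op == "ln":           # layer normalization, no extra arg
            ops.append(("ln", None))
        elif op == "act":          # activation named by `arg`
            ops.append(("act", arg))

    def forward(x):
        for op, arg in ops:
            if op == "full":
                W, b = arg
                x = x @ W + b
            elif op == "ln":       # normalize over the feature axis
                mu = x.mean(axis=-1, keepdims=True)
                sd = x.std(axis=-1, keepdims=True)
                x = (x - mu) / (sd + 1e-6)
            elif op == "act" and arg == "relu":
                x = np.maximum(x, 0.0)
        return x

    return forward

spec = json.loads('[["full", 500], ["ln", ""], ["act", "relu"], ["full", 1]]')
f = build_forward(spec, in_dim=40 * 24, rng=np.random.default_rng(0))
out = f(np.ones((2, 40 * 24)))
print(out.shape)  # (2, 1)
```

Note how `ln` slots between `full` and `act`, which is exactly where the command above places it.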
The experiments were repeated three times, and the results were similar each time. Maybe I didn't make good use of `ln`. Thanks.
And I have sent you an email. Thanks for your patience.
Hi,
I have checked my experiment logs; this is DeepFM on Avazu. I think you can set l2_v = 1e-6 and check whether it works.
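For clarity, `l2_v` is the L2 coefficient on the embedding vectors. A minimal sketch of how such a penalty enters the training loss, assuming a plain sum-of-squares penalty with no 1/2 factor (the exact form used in the repo is not shown here):

```python
import numpy as np

def logloss_with_l2(y_true, y_pred, embed, l2_v=1e-6):
    """Binary log loss plus an L2 penalty on the embedding table.

    `l2_v` matches the coefficient discussed above; the function name
    and the exact penalty form are illustrative assumptions.
    """
    eps = 1e-7
    p = np.clip(y_pred, eps, 1.0 - eps)
    ll = -np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return ll + l2_v * np.sum(embed ** 2)

y = np.array([1.0, 0.0])
p = np.array([0.9, 0.1])
V = np.ones((3, 2))  # toy embedding table
loss = logloss_with_l2(y, p, V, l2_v=1e-6)
```

With `l2_v = 0` this reduces to the plain log loss reported in the tables below.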
DeepFM:

| epoch | decay | l2_v | drop_out | auc | logloss | net | embed |
|---|---|---|---|---|---|---|---|
| 5 | 0.8 | 1e-06 | 0.5 | 0.782641 | 0.378415 | 700*5 | 40 |
| 5 | 0.8 | 1e-06 | 0.7 | 0.783461 | 0.377938 | 700*5 | 40 |
| 5 | 0.8 | 1e-06 | 0.9 | 0.783198 | 0.377761 | 700*5 | 40 |
| 5 | 0.8 | 1e-06 | 1 | 0.783292 | 0.377685 | 700*5 | 40 |
| 5 | 0.8 | 1e-05 | 0.5 | 0.780782 | 0.379708 | 700*5 | 40 |
| 5 | 0.8 | 1e-05 | 0.7 | 0.781389 | 0.37918 | 700*5 | 40 |
| 5 | 0.8 | 1e-05 | 0.9 | 0.781446 | 0.378731 | 700*5 | 40 |
| 5 | 0.8 | 1e-05 | 1 | 0.781914 | 0.378472 | 700*5 | 40 |
| 5 | 0.8 | 1e-04 | 0.5 | 0.778269 | 0.380837 | 700*5 | 40 |
| 5 | 0.8 | 1e-04 | 0.7 | 0.778899 | 0.380691 | 700*5 | 40 |
| 5 | 0.8 | 1e-04 | 0.9 | 0.778618 | 0.380525 | 700*5 | 40 |
| 5 | 0.8 | 1e-04 | 1 | 0.779388 | 0.380004 | 700*5 | 40 |
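On the `decay` column: this reads as a learning-rate decay factor applied once per epoch. Whether `tf_main.py` decays per epoch or per step is an assumption here; the arithmetic itself is just exponential decay:

```python
def decayed_lr(base_lr, decay, epoch):
    """Learning rate after `epoch` epochs of exponential decay.

    Per-epoch application is an assumption about the repo's schedule.
    """
    return base_lr * decay ** epoch

# e.g. base learning rate 1e-3 with decay 0.8 after 5 epochs
print(decayed_lr(1e-3, 0.8, 5))  # ~3.2768e-04
```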
And PIN on Avazu:
| sub_nn | batch_size | factor | net | decay | epoch | loss | auc |
|---|---|---|---|---|---|---|---|
| 40,5 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.375942 | 0.786689 |
| 40,5 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376159 | 0.786459 |
| 40,5 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.375764 | 0.786983 |
| 40,5 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.375625 | 0.786931 |
| 40,5 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.375527 | 0.787035 |
| 40,1 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.376006 | 0.786417 |
| 40,1 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376224 | 0.786596 |
| 40,1 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.376154 | 0.786679 |
| 40,1 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.376184 | 0.786484 |
| 40,1 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.376165 | 0.786259 |
| 40,10 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.37566 | 0.786841 |
| 40,10 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376154 | 0.786318 |
| 40,10 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.375758 | 0.786791 |
| 40,10 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.375566 | 0.787166 |
| 40,10 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.375806 | 0.786953 |
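The `sub_nn` column gives the two `full` widths in `--sub_nn_layers`: a hidden width of 40 and an output width k (5, 1, or 10) for the micro-network PIN applies to each pair of field embeddings. A hedged NumPy sketch of that pairwise structure; the concatenated pair input and the shared weights are illustrative assumptions (the repo may feed the sub-net differently, e.g. including product terms):

```python
import numpy as np
from itertools import combinations

def pin_pairwise(embeds, W1, b1, W2, b2):
    """Apply one shared micro-network to every pair of field embeddings,
    in the spirit of --sub_nn_layers=[["full", 40], ..., ["full", k], ...].
    Shapes and weight sharing are assumptions for illustration.
    """
    outs = []
    for i, j in combinations(range(embeds.shape[0]), 2):
        x = np.concatenate([embeds[i], embeds[j]])  # pair input, 2*embed_size
        h = np.maximum(x @ W1 + b1, 0.0)            # "full" 40 + relu
        outs.append(h @ W2 + b2)                    # "full" k (k = 5 here)
    return np.concatenate(outs)  # flattened pair features for the top MLP

rng = np.random.default_rng(0)
num_fields, k = 4, 5
e = rng.standard_normal((num_fields, 40))
W1 = rng.standard_normal((80, 40)) * 0.1; b1 = np.zeros(40)
W2 = rng.standard_normal((40, k)) * 0.1; b2 = np.zeros(k)
print(pin_pairwise(e, W1, b1, W2, b2).shape)  # (30,) -> C(4,2) pairs * k
```

This also shows why the sub-net output width k changes the width of the top MLP's input, which is what the three `sub_nn` blocks in the table vary.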
I'm sorry for the unclear parameter report in the paper. I did not use l2 regularization on Avazu at first, because I found it did not work on other models. Later my second author, Bohui, helped me test DeepFM, and he used l2 regularization in his experiments; I think there was some misunderstanding when he reported the parameters to me. I copied these logs from my Excel sheet, so I am sure you can repeat these experiments.
I am sorry that I cannot find the DeepFM logs on Criteo, because those experiments were conducted by Bohui.
I'm willing to provide further help.
I ran some experiments with this repo following the paper, but there is a gap in AUC and loss between my results and those in the paper. Maybe my parameters are wrong, so could you provide the command lines used for the paper? Thanks.