Atomu2014 / product-nets-distributed

distributed version of product-nets

What parameters did the paper use? #1

Closed huaileiseu closed 3 years ago

huaileiseu commented 5 years ago

I ran some experiments with this repo following the paper, but there is a gap in AUC and loss between my results and the paper's. Maybe my parameters are wrong, so could you provide the command lines the paper used? Thanks.

Atomu2014 commented 5 years ago

Hi,

There are too many experiment records. If you can provide more details about your settings, I can help you fix this problem.

huaileiseu commented 5 years ago

I used the parameters from Table 5 in the paper. The command for deepfm is as follows:

nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=deepfm --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 1]]" > resultDeepFM_avazu.txt 2>&1 &

the best auc: 0.781810 loss:0.378762

And with layer normalization ("ln"):

nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=deepfm --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 500], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 1]]" > resultDeepFM_avazu.txt 2>&1 &

the best auc: 0.781523 loss:0.378853

And the command for pin:

nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu --model=pin --batch_size=2000 --eval_level=0 --optimizer=adam --learning_rate=1e-3 --embed_size=40 --l2_embed=0 --l2_kernel=0 --num_rounds=20 --nn_layers="[[\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 1]]" --sub_nn_layers="[[\"full\", 40], [\"ln\", \"\"], [\"act\", \"relu\"], [\"full\", 5], [\"ln\", \"\"]]" > resultPIN_avazu.txt 2>&1 &

I repeated each experiment three times, and the results were similar. Maybe I didn't make good use of "ln". Thanks.
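For reference, the --nn_layers value in the commands above is a JSON list of [layer_type, parameter] pairs. Assuming tf_main.py decodes it with Python's json module (an assumption; not verified against the repo's source), the structure can be inspected like this:

```python
import json

# The --nn_layers value from the deepfm command above. Assumption: tf_main.py
# is presumed to parse this flag as JSON; only the string itself comes from
# the commands in this thread.
nn_layers = ('[["full", 500], ["act", "relu"], ["full", 500], ["act", "relu"], '
             '["full", 500], ["act", "relu"], ["full", 500], ["act", "relu"], '
             '["full", 500], ["act", "relu"], ["full", 1]]')

layers = json.loads(nn_layers)

print(layers[0])    # ['full', 500]: a fully connected layer with 500 units
print(len(layers))  # 11 entries: five full+relu blocks plus one output unit
```

This matches the "700*5" network notation in the logs below only in depth (five hidden layers); the widths differ (500 vs 700), which may itself explain part of the gap.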

huaileiseu commented 5 years ago

And I have sent you an email. Thanks for your patience.

Atomu2014 commented 5 years ago

Hi,

I have checked my experiment logs; this is deepfm on avazu. I think you can use l2_v = 1e-6 and check if it works.

DeepFM | epoch | decay | l2_v | drop_out | auc | logloss | net | embed
--- | --- | --- | --- | --- | --- | --- | --- | ---
  | 5 | 0.8 | 1e-06 | 0.5 | 0.782641 | 0.378415 | 700*5 | 40
  | 5 | 0.8 | 1e-06 | 0.7 | 0.783461 | 0.377938 | 700*5 | 40
  | 5 | 0.8 | 1e-06 | 0.9 | 0.783198 | 0.377761 | 700*5 | 40
  | 5 | 0.8 | 1e-06 | 1 | 0.783292 | 0.377685 | 700*5 | 40
  | 5 | 0.8 | 1e-05 | 0.5 | 0.780782 | 0.379708 | 700*5 | 40
  | 5 | 0.8 | 1e-05 | 0.7 | 0.781389 | 0.37918 | 700*5 | 40
  | 5 | 0.8 | 1e-05 | 0.9 | 0.781446 | 0.378731 | 700*5 | 40
  | 5 | 0.8 | 1e-05 | 1 | 0.781914 | 0.378472 | 700*5 | 40
  | 5 | 0.8 | 1e-04 | 0.5 | 0.778269 | 0.380837 | 700*5 | 40
  | 5 | 0.8 | 1e-04 | 0.7 | 0.778899 | 0.380691 | 700*5 | 40
  | 5 | 0.8 | 1e-04 | 0.9 | 0.778618 | 0.380525 | 700*5 | 40
  | 5 | 0.8 | 1e-04 | 1 | 0.779388 | 0.380004 | 700*5 | 40
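A sketch of how the l2_v = 1e-6 suggestion might be applied to the earlier deepfm command, assuming l2_v in the log corresponds to the --l2_embed flag (an assumption; the decay and drop_out columns from the log have no counterpart among the flags shown in this thread, so they are left out here):

```
# Hedged sketch: same deepfm command as before, with --l2_embed raised from
# 0 to 1e-6 (assuming l2_v maps to --l2_embed; not verified against the repo).
nohup python2 -u tf_main.py --distributed=False --num_gpus=1 --dataset=avazu \
  --model=deepfm --batch_size=2000 --eval_level=0 --optimizer=adam \
  --learning_rate=1e-3 --embed_size=40 --l2_embed=1e-6 --l2_kernel=0 \
  --num_rounds=20 \
  --nn_layers="[[\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 500], [\"act\", \"relu\"], [\"full\", 1]]" \
  > resultDeepFM_avazu_l2.txt 2>&1 &
```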

And PIN on Avazu:

PNN | batch_size | factor | net | decay | epoch | loss | auc
--- | --- | --- | --- | --- | --- | --- | ---
40,5 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.375942 | 0.786689
40,5 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376159 | 0.786459
40,5 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.375764 | 0.786983
40,5 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.375625 | 0.786931
40,5 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.375527 | 0.787035
40,1 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.376006 | 0.786417
40,1 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376224 | 0.786596
40,1 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.376154 | 0.786679
40,1 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.376184 | 0.786484
40,1 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.376165 | 0.786259
40,10 | 2000 | 40 | 700*5 | 0.8 | 5 | 0.37566 | 0.786841
40,10 | 2000 | 40 | 700*5 | 0.9 | 5 | 0.376154 | 0.786318
40,10 | 2000 | 40 | 700*5 | 0.9 | 10 | 0.375758 | 0.786791
40,10 | 2000 | 40 | 700*5 | 0.95 | 20 | 0.375566 | 0.787166
40,10 | 2000 | 40 | 700*5 | 0.97 | 40 | 0.375806 | 0.786953

I'm sorry for the unclear parameter report in the paper. I did not use l2 regularization on Avazu at first, because I found it did not work on other models. Later my second author Bohui helped me test deepfm, and he used l2 regularization in his experiments. I think there was some misunderstanding when he reported the parameters to me. I copied these logs from my excel sheet, so I am sure you can repeat these experiments.

I am sorry I cannot find the deepfm logs on Criteo, because those experiments were conducted by Bohui.

I'm willing to provide further help.