qiaoguan / deep-ctr-prediction

CTR prediction models based on deep learning (deep-learning-based CTR prediction models for ad recommendation)
https://github.com/qiaoguan/deep-ctr-prediction
907 stars, 276 forks

ESMM training loss keeps oscillating around 400. What is the cause? #12

Closed sunjiadong closed 4 years ago

sunjiadong commented 4 years ago

Training log:

INFO:tensorflow:global_step/sec: 12.3102
I0511 16:43:03.123719 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 12.3102
INFO:tensorflow:loss = 472.53354, step = 38300 (8.123 sec)
I0511 16:43:03.124620 139644774647616 basic_session_run_hooks.py:260] loss = 472.53354, step = 38300 (8.123 sec)
INFO:tensorflow:global_step/sec: 13.4449
I0511 16:43:10.561764 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 13.4449
INFO:tensorflow:loss = 496.88828, step = 38400 (7.439 sec)
I0511 16:43:10.563358 139644774647616 basic_session_run_hooks.py:260] loss = 496.88828, step = 38400 (7.439 sec)
INFO:tensorflow:global_step/sec: 13.5721
I0511 16:43:17.929780 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 13.5721
INFO:tensorflow:loss = 494.92902, step = 38500 (7.368 sec)
I0511 16:43:17.931165 139644774647616 basic_session_run_hooks.py:260] loss = 494.92902, step = 38500 (7.368 sec)
INFO:tensorflow:global_step/sec: 10.366
I0511 16:43:27.576712 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 10.366
INFO:tensorflow:loss = 477.77087, step = 38600 (9.647 sec)
I0511 16:43:27.578247 139644774647616 basic_session_run_hooks.py:260] loss = 477.77087, step = 38600 (9.647 sec)
INFO:tensorflow:global_step/sec: 13.6566
I0511 16:43:34.899176 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 13.6566
INFO:tensorflow:loss = 469.46252, step = 38700 (7.322 sec)
I0511 16:43:34.900484 139644774647616 basic_session_run_hooks.py:260] loss = 469.46252, step = 38700 (7.322 sec)
INFO:tensorflow:global_step/sec: 14.8222
I0511 16:43:41.645576 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 14.8222
INFO:tensorflow:loss = 505.62067, step = 38800 (6.746 sec)
I0511 16:43:41.646508 139644774647616 basic_session_run_hooks.py:260] loss = 505.62067, step = 38800 (6.746 sec)
INFO:tensorflow:global_step/sec: 14.7337
I0511 16:43:48.432974 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 14.7337
INFO:tensorflow:loss = 508.70572, step = 38900 (6.788 sec)
I0511 16:43:48.434319 139644774647616 basic_session_run_hooks.py:260] loss = 508.70572, step = 38900 (6.788 sec)
INFO:tensorflow:global_step/sec: 14.245
I0511 16:43:55.452730 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 14.245
INFO:tensorflow:loss = 481.75873, step = 39000 (7.019 sec)
I0511 16:43:55.453657 139644774647616 basic_session_run_hooks.py:260] loss = 481.75873, step = 39000 (7.019 sec)
INFO:tensorflow:global_step/sec: 14.1653
I0511 16:44:02.512451 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 14.1653
INFO:tensorflow:loss = 492.90146, step = 39100 (7.060 sec)
I0511 16:44:02.513763 139644774647616 basic_session_run_hooks.py:260] loss = 492.90146, step = 39100 (7.060 sec)
INFO:tensorflow:global_step/sec: 13.9005
I0511 16:44:09.706491 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 13.9005
INFO:tensorflow:loss = 481.75992, step = 39200 (7.194 sec)
I0511 16:44:09.708160 139644774647616 basic_session_run_hooks.py:260] loss = 481.75992, step = 39200 (7.194 sec)
INFO:tensorflow:global_step/sec: 13.2426
I0511 16:44:17.257735 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 13.2426
INFO:tensorflow:loss = 490.50104, step = 39300 (7.551 sec)
I0511 16:44:17.259049 139644774647616 basic_session_run_hooks.py:260] loss = 490.50104, step = 39300 (7.551 sec)
INFO:tensorflow:global_step/sec: 10.1131
I0511 16:44:27.145826 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 10.1131
INFO:tensorflow:loss = 478.1643, step = 39400 (9.888 sec)
I0511 16:44:27.146829 139644774647616 basic_session_run_hooks.py:260] loss = 478.1643, step = 39400 (9.888 sec)
INFO:tensorflow:global_step/sec: 10.4625
I0511 16:44:36.704025 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 10.4625
INFO:tensorflow:loss = 469.9007, step = 39500 (9.559 sec)
I0511 16:44:36.705528 139644774647616 basic_session_run_hooks.py:260] loss = 469.9007, step = 39500 (9.559 sec)
INFO:tensorflow:global_step/sec: 14.7374
I0511 16:44:43.489141 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 14.7374
INFO:tensorflow:loss = 481.07245, step = 39600 (6.785 sec)
I0511 16:44:43.490077 139644774647616 basic_session_run_hooks.py:260] loss = 481.07245, step = 39600 (6.785 sec)
INFO:tensorflow:global_step/sec: 11.0079
I0511 16:44:52.573785 139644774647616 basic_session_run_hooks.py:692] global_step/sec: 11.0079

Final evaluation metrics: ctr_accuracy: 0.62085813, ctr_auc: 0.6696427, cvr_accuracy: 0.9009072, cvr_auc: 0.67000365, global_step: 40000, loss: 488.3895
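
Rather than eyeballing the log, the logged loss values can be pulled out and summarized to see how wide the oscillation actually is. This is only a minimal sketch that assumes the `INFO:tensorflow:loss = ..., step = ...` line format shown in the log above; `parse_losses` is a hypothetical helper, not part of the repository.

```python
import re
from statistics import mean

# Matches the "INFO:tensorflow:loss = <value>, step = <n>" records in the log.
LOSS_RE = re.compile(r"INFO:tensorflow:loss = ([0-9.]+), step =\s*(\d+)")


def parse_losses(log_text):
    """Return a list of (step, loss) pairs found in the training log text."""
    return [(int(step), float(loss)) for loss, step in LOSS_RE.findall(log_text)]


# Three records copied from the log in this issue.
sample_log = """
INFO:tensorflow:loss = 472.53354, step = 38300 (8.123 sec)
INFO:tensorflow:loss = 496.88828, step = 38400 (7.439 sec)
INFO:tensorflow:loss = 494.92902, step = 38500 (7.368 sec)
"""

pairs = parse_losses(sample_log)
losses = [loss for _, loss in pairs]
print("steps:", [step for step, _ in pairs])
print("min/mean/max loss:", min(losses), mean(losses), max(losses))
```

Comparing the min/mean/max over an early window and a late window of steps gives a rough indication of whether the loss is still trending down or has flattened out.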

qiaoguan commented 4 years ago

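Could it be that training has already converged on your dataset? You could analyze the distribution of your training data (for example, the positive-to-negative ratios of the CTR and CVR samples), then try tuning the learning rate, the weights of the CTR and CVR losses in the total loss, and so on.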
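
The two checks suggested here can be sketched in plain Python: measuring the positive rate of each label, and weighting the two loss terms. The sketch assumes the usual ESMM formulation, in which the second task is supervised through pCTCVR = pCTR × pCVR; the names `positive_rate`, `log_loss`, `esmm_loss`, `ctr_weight`, and `ctcvr_weight` are hypothetical and are not taken from the repository. The loss here is averaged per example, which may differ from how the repository reduces it.

```python
import math


def positive_rate(labels):
    """Fraction of positive (1) labels; 0.0 for an empty sequence."""
    return sum(labels) / len(labels) if labels else 0.0


def log_loss(labels, probs, eps=1e-7):
    """Mean binary cross-entropy over the examples."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def esmm_loss(clicks, conversions, p_ctr, p_cvr, ctr_weight=1.0, ctcvr_weight=1.0):
    """Weighted sum of the CTR loss and the CTCVR loss (pCTCVR = pCTR * pCVR).

    ctr_weight and ctcvr_weight are hypothetical tuning knobs.
    """
    p_ctcvr = [a * b for a, b in zip(p_ctr, p_cvr)]
    ctr_term = log_loss(clicks, p_ctr)
    ctcvr_term = log_loss(conversions, p_ctcvr)
    return ctr_weight * ctr_term + ctcvr_weight * ctcvr_term


# Toy labels and predictions, made up purely for illustration.
clicks = [1, 1, 0, 0]
conversions = [1, 0, 0, 0]  # a conversion can only happen after a click
p_ctr = [0.5, 0.5, 0.5, 0.5]
p_cvr = [0.5, 0.5, 0.5, 0.5]

print("click rate:", positive_rate(clicks))
print("conversion rate:", positive_rate(conversions))
print("loss, equal weights:", esmm_loss(clicks, conversions, p_ctr, p_cvr))
print("loss, CTCVR upweighted:", esmm_loss(clicks, conversions, p_ctr, p_cvr, ctcvr_weight=2.0))
```

Because the loss is a weighted sum, its absolute value changes with the weights, so runs with different weights are better compared on AUC than on the raw loss.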

sunjiadong commented 4 years ago

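OK. The positive-to-negative ratio for CVR may be too high; the CTR samples follow a 1:2 positive-to-negative ratio. I'll adjust the samples.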

sunjiadong commented 4 years ago

CTR uses a 1:2 positive-to-negative ratio and CVR uses a 1:5 positive-to-negative ratio. Is that OK?

qiaoguan commented 4 years ago

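There is no fixed positive-to-negative ratio that your CTR and CVR samples must satisfy. Whether to sample at all, and which sampling ratio works best, is something you can find out by trying a few settings on your own data and looking for a pattern.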
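
One way to try a few sampling ratios is to downsample negatives to a target number of negatives per positive and retrain once per ratio. The following is a minimal sketch under that assumption; `downsample_negatives`, the `(features, label)` tuple layout, and the candidate ratios are all hypothetical and not part of the repository.

```python
import random


def downsample_negatives(examples, neg_per_pos, seed=0):
    """Keep all positives and at most neg_per_pos negatives per positive.

    `examples` is a sequence of (features, label) tuples with label in {0, 1}.
    """
    rng = random.Random(seed)
    positives = [ex for ex in examples if ex[1] == 1]
    negatives = [ex for ex in examples if ex[1] == 0]
    # Never ask for more negatives than are available.
    keep = min(len(negatives), int(round(neg_per_pos * len(positives))))
    mixed = positives + rng.sample(negatives, keep)
    rng.shuffle(mixed)
    return mixed


# Synthetic data for illustration: 10 positives and 100 negatives.
data = [({"id": i}, 1) for i in range(10)] + [({"id": i}, 0) for i in range(10, 110)]

for ratio in (1, 2, 5):  # candidate negatives-per-positive ratios to compare
    sampled = downsample_negatives(data, ratio)
    n_pos = sum(label for _, label in sampled)
    print(f"1:{ratio} -> {n_pos} positives, {len(sampled) - n_pos} negatives")
```

Each sampled set would then be used for a separate training run, with the runs compared on the same unsampled evaluation set so that the AUC numbers stay comparable.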