FeiSun / BERT4Rec

BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
Apache License 2.0

Code on Ml-20m can't achieve the same performance as in paper #13

Open outside-BUPT opened 3 years ago

outside-BUPT commented 3 years ago

My result: hit@1:0.2399, ndcg@5:0.393, hit@5:0.536, ndcg@10:0.4379, hit@10:0.6718, ap:0.3794

Result from the paper: hit@1:0.3440, ndcg@5:0.4967, hit@5:0.6323, ndcg@10:0.5340, hit@10:0.7473, ap:0.4785
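
To make the size of the gap easier to read, here is a minimal sketch that computes how far each reproduced metric falls below the paper's value. The numbers are copied from the two lines above; the helper name `relative_gaps` is only illustrative and is not part of the repository.

```python
# Metrics copied from the comment above (reproduced run vs. paper).
mine = {"hit@1": 0.2399, "ndcg@5": 0.393, "hit@5": 0.536,
        "ndcg@10": 0.4379, "hit@10": 0.6718, "ap": 0.3794}
paper = {"hit@1": 0.3440, "ndcg@5": 0.4967, "hit@5": 0.6323,
         "ndcg@10": 0.5340, "hit@10": 0.7473, "ap": 0.4785}


def relative_gaps(reproduced, reported):
    """Return (reported - reproduced) / reported for every metric."""
    return {name: (reported[name] - reproduced[name]) / reported[name]
            for name in reported}


gaps = relative_gaps(mine, paper)
for name, gap in gaps.items():
    print(f"{name}: {gap:.1%} below the paper")
```

The shortfall is largest for hit@1 and smallest for hit@10, so the reproduced model loses the most at the very top of the ranking.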

asash commented 2 years ago

@outside-BUPT For how many steps did you train the model?

oscarriddle commented 11 months ago

I tried to run run_ml-1m.sh and also got bad evaluation performance as below. I noticed the script sets train steps to 400000.

ndcg@1:0.016556291390728478, hit@1:0.016556291390728478, ndcg@5:0.041968522744744094, hit@5:0.06837748344370861, ndcg@10:0.06059810508379426, hit@10:0.12682119205298012, ap:0.06324608749550562, valid_user:6040.0
INFO:tensorflow:Inference Time : 27.27113s
I1007 17:31:45.042668 140571207984832 evaluation.py:269] Inference Time : 27.27113s
INFO:tensorflow:Finished evaluation at 2023-10-07-17:31:45
I1007 17:31:45.043014 140571207984832 evaluation.py:271] Finished evaluation at 2023-10-07-17:31:45
INFO:tensorflow:Saving dict for global step 400000: global_step = 400000, loss = 9.357097, masked_lm_accuracy = 0.0013245033, masked_lm_loss = 9.3572
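
For anyone reading these numbers, here is a rough sketch of how per-user ranking metrics of this kind are commonly computed. It assumes the usual leave-one-out setup (one held-out item per user) and a zero-based rank of that item among the scored candidates. It is not the repository's evaluation code, and the function names are hypothetical.

```python
import math


def metrics_for_rank(rank):
    """Metrics for one user, given the zero-based rank of the held-out item."""
    out = {}
    for k in (1, 5, 10):
        hit = 1.0 if rank < k else 0.0
        out[f"hit@{k}"] = hit
        # With a single relevant item the ideal DCG is 1, so NDCG is just
        # the discounted gain at the item's position.
        out[f"ndcg@{k}"] = hit / math.log2(rank + 2)
    out["ap"] = 1.0 / (rank + 1)  # average precision with one relevant item
    return out


def average_metrics(ranks):
    """Average the per-user metrics over all evaluated users."""
    totals = {}
    for rank in ranks:
        for name, value in metrics_for_rank(rank).items():
            totals[name] = totals.get(name, 0.0) + value
    return {name: value / len(ranks) for name, value in totals.items()}


print(average_metrics([0, 3, 12]))
```

Under these assumptions ndcg@1 and hit@1 are always identical, because the discount at rank 0 is 1/log2(2) = 1. That would explain why the two values coincide in logs like these.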

asash commented 11 months ago

@oscarriddle The default number of training steps in this script is too low to reproduce the reported result. If you want to get closer to the reported results, you'll have to increase the number of steps by a factor of 20 (e.g. to 8M or even more), but the training will take a lot of time.

See our reproducibility paper on this: https://browse.arxiv.org/pdf/2207.07483.pdf

oscarriddle commented 11 months ago

After training for about 35,000,000 steps, the model is still stuck around a rather low saddle point, only slightly above the SASRec performance reported in the paper. The most recent evaluation outputs:

ndcg@1:0.24486754966887417, hit@1:0.24486754966887417, ndcg@5:0.4067822076254277, hit@5:0.5536423841059602, ndcg@10:0.44381509752787934, hit@10:0.6673841059602649, ap:0.38591140681367797, valid_user:6040.0
ndcg@1:0.24519867549668875, hit@1:0.24519867549668875, ndcg@5:0.40733902240485165, hit@5:0.5538079470198676, ndcg@10:0.44420200049712466, hit@10:0.666887417218543, ap:0.3865876645860639, valid_user:6040.0
ndcg@1:0.24486754966887417, hit@1:0.24486754966887417, ndcg@5:0.4067814130590064, hit@5:0.5524834437086092, ndcg@10:0.4443041962387484, hit@10:0.6675496688741722, ap:0.3864707774135114, valid_user:6040.0
ndcg@1:0.24519867549668875, hit@1:0.24519867549668875, ndcg@5:0.40754205560187584, hit@5:0.5539735099337748, ndcg@10:0.4447855702971012, hit@10:0.6683774834437086, ap:0.38679013575800086, valid_user:6040.0
ndcg@1:0.24437086092715232, hit@1:0.24437086092715232, ndcg@5:0.40725909975358604, hit@5:0.5543046357615894, ndcg@10:0.4443552628363676, hit@10:0.6683774834437086, ap:0.38621019782555793, valid_user:6040.0
ndcg@1:0.24370860927152319, hit@1:0.24370860927152319, ndcg@5:0.406638924762195, hit@5:0.5533112582781456, ndcg@10:0.4437388075143209, hit@10:0.6670529801324503, ap:0.385885343545252, valid_user:6040.0
ndcg@1:0.24503311258278146, hit@1:0.24503311258278146, ndcg@5:0.40708202926577325, hit@5:0.5533112582781456, ndcg@10:0.4445961973538064, hit@10:0.6685430463576159, ap:0.3864957036428388, valid_user:6040.0
ndcg@1:0.2455298013245033, hit@1:0.2455298013245033, ndcg@5:0.4077146645047479, hit@5:0.5543046357615894, ndcg@10:0.4448187866218854, hit@10:0.6683774834437086, ap:0.3868417302898083, valid_user:6040.0
ndcg@1:0.24519867549668875, hit@1:0.24519867549668875, ndcg@5:0.40756194630460774, hit@5:0.5543046357615894, ndcg@10:0.444561832998447, hit@10:0.6678807947019868, ap:0.386672691543328, valid_user:6040.0
ndcg@1:0.24420529801324503, hit@1:0.24420529801324503, ndcg@5:0.40684643995653613, hit@5:0.5536423841059602, ndcg@10:0.44364797730057237, hit@10:0.6663907284768212, ap:0.3860289918791995, valid_user:6040.0
ndcg@1:0.24486754966887417, hit@1:0.24486754966887417, ndcg@5:0.4070607410574634, hit@5:0.5531456953642384, ndcg@10:0.4442357483707688, hit@10:0.666887417218543, ap:0.3866187084025226, valid_user:6040.0
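
To check whether the metrics have really plateaued, the evaluation lines can be parsed and the spread of each metric inspected. The sketch below assumes the `name:value` layout shown in the output above (with or without leading progress dots). The helper names are hypothetical, and only the first two lines are included as sample input.

```python
import re

# Matches the metric fields printed in the evaluation lines above.
PAIR = re.compile(r"\b(ndcg@\d+|hit@\d+|ap|valid_user):(\d+(?:\.\d+)?)")


def parse_eval_line(line):
    """Turn one evaluation line into a {metric: value} dict ({} if no metrics)."""
    return {name: float(value) for name, value in PAIR.findall(line)}


def summarize(lines):
    """Return {metric: (min, max)} across all lines that contain metrics."""
    rows = [row for row in map(parse_eval_line, lines) if row]
    if not rows:
        return {}
    return {name: (min(r[name] for r in rows), max(r[name] for r in rows))
            for name in rows[0]}


sample = [
    "....ndcg@1:0.24486754966887417, hit@1:0.24486754966887417, "
    "ndcg@5:0.4067822076254277, hit@5:0.5536423841059602, "
    "ndcg@10:0.44381509752787934, hit@10:0.6673841059602649, "
    "ap:0.38591140681367797, valid_user:6040.0",
    "....ndcg@1:0.24519867549668875, hit@1:0.24519867549668875, "
    "ndcg@5:0.40733902240485165, hit@5:0.5538079470198676, "
    "ndcg@10:0.44420200049712466, hit@10:0.666887417218543, "
    "ap:0.3865876645860639, valid_user:6040.0",
]

for name, (low, high) in summarize(sample).items():
    print(f"{name}: min={low:.4f} max={high:.4f}")
```

If every metric has a very small min-to-max range across many consecutive evaluations, that would be consistent with training having stalled rather than still converging slowly.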