Open outside-BUPT opened 3 years ago
@outside-BUPT how many steps did you train the model?
I tried to run run_ml-1m.sh and also got bad evaluation performance as below. I noticed the script sets train steps to 400000.
ndcg@1:0.016556291390728478, hit@1:0.016556291390728478, ndcg@5:0.041968522744744094, hit@5:0.06837748344370861, ndcg@10:0.06059810508379426, hit@10:0.12682119205298012, ap:0.06324608749550562, valid_user:6040.0
INFO:tensorflow:Inference Time : 27.27113s
I1007 17:31:45.042668 140571207984832 evaluation.py:269] Inference Time : 27.27113s
INFO:tensorflow:Finished evaluation at 2023-10-07-17:31:45
I1007 17:31:45.043014 140571207984832 evaluation.py:271] Finished evaluation at 2023-10-07-17:31:45
INFO:tensorflow:Saving dict for global step 400000: global_step = 400000, loss = 9.357097, masked_lm_accuracy = 0.0013245033, masked_lm_loss = 9.3572
@oscarriddle The default number of training steps in this script is too low to reproduce the reported result. To get closer to the reported results, you'll have to increase the number of steps by a factor of 20 (e.g. to 8M or even more), but training will take a long time.
See our reproducibility paper on this https://browse.arxiv.org/pdf/2207.07483.pdf
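To make the arithmetic behind the suggested step count explicit, here is a minimal sketch. The default of 400000 steps is the value reported for run_ml-1m.sh earlier in this thread; the helper name `scaled_steps` is hypothetical and not part of the repository.

```python
# Default taken from this thread: run_ml-1m.sh trains for 400000 steps.
DEFAULT_TRAIN_STEPS = 400_000
# Scaling factor suggested in the reply above.
SCALE_FACTOR = 20


def scaled_steps(default_steps: int, factor: int) -> int:
    """Return the training step count after scaling the default (hypothetical helper)."""
    return default_steps * factor


steps = scaled_steps(DEFAULT_TRAIN_STEPS, SCALE_FACTOR)
print(f"{steps:,} training steps")
```

The resulting number would then replace the train-steps setting in the script by hand.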
After training for about 35,000,000 steps, the model is still hovering around a rather low saddle point, barely above the SASRec performance reported in the paper.
ndcg@1:0.24486754966887417, hit@1:0.24486754966887417, ndcg@5:0.4067822076254277, hit@5:0.5536423841059602, ndcg@10:0.44381509752787934, hit@10:0.6673841059602649, ap:0.38591140681367797, valid_user:6040.0
ndcg@1:0.24519867549668875, hit@1:0.24519867549668875, ndcg@5:0.40733902240485165, hit@5:0.5538079470198676, ndcg@10:0.44420200049712466, hit@10:0.666887417218543, ap:0.3865876645860639, valid_user:6040.0
ndcg@1:0.24486754966887417, hit@1:0.24486754966887417, ndcg@5:0.4067814130590064, hit@5:0.5524834437086092, ndcg@10:0.4443041962387484, hit@10:0.6675496688741722, ap:0.3864707774135114, valid_user:6040.0
ndcg@1:0.24519867549668875, hit@1:0.24519867549668875, ndcg@5:0.40754205560187584, hit@5:0.5539735099337748, ndcg@10:0.4447855702971012, hit@10:0.6683774834437086, ap:0.38679013575800086, valid_user:6040.0
ndcg@1:0.24437086092715232, hit@1:0.24437086092715232, ndcg@5:0.40725909975358604, hit@5:0.5543046357615894, ndcg@10:0.4443552628363676, hit@10:0.6683774834437086, ap:0.38621019782555793, valid_user:6040.0
ndcg@1:0.24370860927152319, hit@1:0.24370860927152319, ndcg@5:0.406638924762195, hit@5:0.5533112582781456, ndcg@10:0.4437388075143209, hit@10:0.6670529801324503, ap:0.385885343545252, valid_user:6040.0
ndcg@1:0.24503311258278146, hit@1:0.24503311258278146, ndcg@5:0.40708202926577325, hit@5:0.5533112582781456, ndcg@10:0.4445961973538064, hit@10:0.6685430463576159, ap:0.3864957036428388, valid_user:6040.0
ndcg@1:0.2455298013245033, hit@1:0.2455298013245033, ndcg@5:0.4077146645047479, hit@5:0.5543046357615894, ndcg@10:0.4448187866218854, hit@10:0.6683774834437086, ap:0.3868417302898083, valid_user:6040.0
ndcg@1:0.24519867549668875, hit@1:0.24519867549668875, ndcg@5:0.40756194630460774, hit@5:0.5543046357615894, ndcg@10:0.444561832998447, hit@10:0.6678807947019868, ap:0.386672691543328, valid_user:6040.0
ndcg@1:0.24420529801324503, hit@1:0.24420529801324503, ndcg@5:0.40684643995653613, hit@5:0.5536423841059602, ndcg@10:0.44364797730057237, hit@10:0.6663907284768212, ap:0.3860289918791995, valid_user:6040.0
ndcg@1:0.24486754966887417, hit@1:0.24486754966887417, ndcg@5:0.4070607410574634, hit@5:0.5531456953642384, ndcg@10:0.4442357483707688, hit@10:0.666887417218543, ap:0.3866187084025226, valid_user:6040.0
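For readers comparing these numbers, the sketch below shows one common way such metrics are computed. It is not taken from the repository's evaluation code. It assumes leave-one-out evaluation with exactly one held-out item per user and a 0-based rank for that item. Under that assumption ndcg@1 and hit@1 coincide, which matches every log line above. The function names and the reading of `ap` as reciprocal rank are assumptions for illustration.

```python
import math


def hit_at_k(rank: int, k: int) -> float:
    """1.0 if the held-out item is ranked inside the top k (rank is 0-based)."""
    return 1.0 if rank < k else 0.0


def ndcg_at_k(rank: int, k: int) -> float:
    """NDCG@k with a single relevant item: the ideal DCG is 1, so NDCG = 1 / log2(rank + 2)."""
    return 1.0 / math.log2(rank + 2) if rank < k else 0.0


def average_metrics(ranks, ks=(1, 5, 10)):
    """Average the per-user metrics over all users, given each user's 0-based rank."""
    n = len(ranks)
    out = {}
    for k in ks:
        out[f"ndcg@{k}"] = sum(ndcg_at_k(r, k) for r in ranks) / n
        out[f"hit@{k}"] = sum(hit_at_k(r, k) for r in ranks) / n
    # Assumption: with one relevant item, average precision reduces to reciprocal rank.
    out["ap"] = sum(1.0 / (r + 1) for r in ranks) / n
    return out


# Three hypothetical users whose held-out items were ranked 1st, 4th and 13th.
metrics = average_metrics([0, 3, 12])
print(metrics)
```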
My result: hit@1:0.2399, ndcg@5:0.393, hit@5:0.536, ndcg@10:0.4379, hit@10:0.6718, ap:0.3794
Result from the paper: hit@1:0.3440, ndcg@5:0.4967, hit@5:0.6323, ndcg@10:0.5340, hit@10:0.7473, ap:0.4785
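To quantify the gap between the two result lines above, a small sketch like the following can parse them and report the absolute and relative shortfall per metric. The helper `parse_metrics` is hypothetical; the numbers are copied from this comment.

```python
def parse_metrics(line: str) -> dict:
    """Parse 'name:value, name:value' pairs into a dict of floats."""
    metrics = {}
    for part in line.split(","):
        name, _, value = part.strip().partition(":")
        metrics[name.strip()] = float(value.strip())
    return metrics


mine = parse_metrics(
    "hit@1:0.2399, ndcg@5:0.393, hit@5:0.536, ndcg@10:0.4379, hit@10:0.6718, ap:0.3794"
)
paper = parse_metrics(
    "hit@1:0.3440, ndcg@5:0.4967, hit@5:0.6323, ndcg@10:0.5340, hit@10:0.7473, ap:0.4785"
)

# Absolute gap and gap relative to the paper's value, per metric.
abs_gap = {name: paper[name] - mine[name] for name in paper}
rel_gap = {name: abs_gap[name] / paper[name] for name in paper}

for name in paper:
    print(f"{name}: paper={paper[name]:.4f} mine={mine[name]:.4f} "
          f"gap={abs_gap[name]:.4f} ({rel_gap[name]:.1%} below paper)")
```

The relative gap is largest at hit@1 and smallest at hit@10, so the shortfall is concentrated at the very top of the ranking.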