xiangwang1223 / neural_graph_collaborative_filtering

Neural Graph Collaborative Filtering, SIGIR2019
MIT License
807 stars · 268 forks

Can you release the code of the associated baseline approaches such as NeuMF? #16

Open Dinxin opened 5 years ago

Dinxin commented 5 years ago

I fed the gowalla dataset into the NeuMF model implemented in the repo at "https://github.com/hexiangnan/neural_collaborative_filtering". Surprisingly, the NDCG@10 reaches 0.3991, but you reported an NDCG@20 of only 0.1985. I am very confused. Could you share the baseline code? (The code for NeuMF alone would also be acceptable.)

xiangwang1223 commented 5 years ago

Hi, thanks for your interest.

  1. Please find the baseline code for BPRMF and NMF under the 'NGCF' repo.
  2. Please check (i) whether you are using the testing code in the NGCF repo, since NMF reports performance with leave-one-out evaluation, while NGCF reports performance with a full ranking over all items; and (ii) could you please report the NDCG@20 of NMF?

Thanks.
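The evaluation-protocol gap in point 2 largely explains the score difference. A toy sketch (catalog size and the true item's rank are made up for illustration; `ndcg_single` assumes exactly one relevant item per user):

```python
import numpy as np

rng = np.random.default_rng(0)
n_items = 40_000                            # Gowalla-scale catalog (illustrative)
scores = rng.normal(size=n_items)           # a model's scores over all items
test_item = int(np.argsort(scores)[-300])   # true item ranked ~300th overall

def ndcg_single(rank, k):
    """NDCG@k when a single relevant item sits at 0-based position `rank`."""
    return 1.0 / np.log2(rank + 2) if rank < k else 0.0

# (a) Leave-one-out protocol (used in the NeuMF paper): rank the test item
# against 99 sampled negatives only.
neg = rng.choice(np.delete(np.arange(n_items), test_item), size=99, replace=False)
cand = np.append(neg, test_item)
rank_loo = int(np.sum(scores[cand] > scores[test_item]))

# (b) Full-ranking protocol (used in the NGCF paper): rank against every item.
rank_full = int(np.sum(scores > scores[test_item]))

# Full-ranking NDCG@20 is 0 here, while leave-one-out NDCG@10 is typically
# near 1: the same model looks far stronger under leave-one-out.
print(ndcg_single(rank_loo, 10), ndcg_single(rank_full, 20))
```

So an NDCG@10 of 0.39 under leave-one-out and an NDCG@20 of 0.198 under full ranking are not comparable numbers.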

Jhy1993 commented 5 years ago

I used the BPRMF, NMF, and NGCF code from this GitHub repo and tested their performance on ML-100k with an 8-2 split. The results are as follows:

It seems NeuMF and NGCF cannot beat BPRMF? Did you observe a similar phenomenon?

BPRMF:

recall    = [0.21326, 0.33034, 0.47130, 0.56550, 0.62961, 0.68017]

precision = [0.32813, 0.26603, 0.20005, 0.16548, 0.14208, 0.12520]

hit       = [0.90446, 0.95223, 0.97877, 0.98620, 0.99045, 0.99363]

ndcg      = [0.40948, 0.43861, 0.50974, 0.56454, 0.60259, 0.63191]

NeuMF:

recall    = [0.17819, 0.28363, 0.41452, 0.50636, 0.57299, 0.62618]

precision = [0.27473, 0.22930, 0.17781, 0.14894, 0.12938, 0.11508]

hit       = [0.87155, 0.93949, 0.96709, 0.98195, 0.98832, 0.99151]

ndcg      = [0.33682, 0.37366, 0.44835, 0.50564, 0.54613, 0.57706]

NGCF:

recall    = [0.17169, 0.28164, 0.41818, 0.51180, 0.57916, 0.63578]

precision = [0.26783, 0.22569, 0.17757, 0.14943, 0.12984, 0.11571]

hit       = [0.85350, 0.94161, 0.97240, 0.98408, 0.99045, 0.99363]

ndcg      = [0.33925, 0.37733, 0.45599, 0.51345, 0.55410, 0.58633]

xiangwang1223 commented 5 years ago

Could you please provide more details about your experiments, such as layer sizes, weight decay, learning rate, and whether pretraining was used for each model?

Based on my experience, one possible reason is that the ML-100k dataset is too small and dense: it contains only 100k interactions among roughly 1,000 users and 1,700 items. It is well known that deep recommenders are prone to overfitting on such data.
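The density gap is easy to quantify (ML-100k figures from the MovieLens release; Gowalla figures as reported in the NGCF paper):

```python
# Interaction density = observed interactions / possible (user, item) pairs.
def density(users, items, interactions):
    return interactions / (users * items)

ml100k  = density(943, 1_682, 100_000)        # ~6.3% of the matrix observed
gowalla = density(29_858, 40_981, 1_027_370)  # ~0.08% observed

print(f"ML-100k: {ml100k:.2%}, Gowalla: {gowalla:.2%}")
```

ML-100k is roughly two orders of magnitude denser than Gowalla, which makes it a very different regime from the datasets the NGCF paper tunes for.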

Jhy1993 commented 5 years ago

I tried to improve NGCF on ml-100k and got better performance:

NGCF (tuned):

recall    = [0.22461, 0.34227, 0.48947, 0.58285, 0.65171, 0.69873]

precision = [0.34437, 0.27734, 0.20985, 0.17335, 0.14878, 0.13023]

hit       = [0.91507, 0.95860, 0.98301, 0.98514, 0.98832, 0.99045]

ndcg      = [0.42971, 0.45348, 0.52414, 0.57802, 0.61781, 0.64464]

With this tuning, NGCF performs better than BPRMF. However, NeuMF still cannot beat BPRMF. I tested NeuMF with different lr, embed_dim, and reg values (e.g., lr=1e-4, keep_prob=0.8, layer_size=64, embed_size=64, reg=1e-5). The best performance of NeuMF on ml-100k:

NeuMF (best):

recall    = [0.19349, 0.30015, 0.44490, 0.53522, 0.60039, 0.65523]

precision = [0.29766, 0.24225, 0.18848, 0.15681, 0.13563, 0.12051]

hit       = [0.89066, 0.94055, 0.97240, 0.98089, 0.98514, 0.98938]

ndcg      = [0.37598, 0.40518, 0.48263, 0.53730, 0.57684, 0.60823]

TedSIWEILIU commented 5 years ago

May I ask how to run the baseline code such as NeuMF? I use the following command:

python NMF.py --dataset gowalla --regs [1e-3] --embed_size 64 --layer_size [64,64] --lr 0.0001 --save_flag 1 --pretrain 0 --batch_size 1024 --epoch 400 --verbose 1 --keep_prob [0.9,0.9]

(I also added the keep_prob argument in the parser.py file.)

But it always gave me the following error:

File "NMF.py", line 118, in create_bpr_loss
    reg_loss = self.regs[-2] * tf.nn.l2_loss(self.weights['h'])
IndexError: list index out of range

Do you know what's the problem?

xiangwang1223 commented 5 years ago

... Since you have two layers when setting --layer_size as [64,64], the parameter --regs should correspondingly be set as a list of two entries, like --regs [1e-3,1e-3].
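The traceback above follows directly from Python's negative indexing: `regs[-2]` needs a list of at least two entries, which a single `--regs [1e-3]` does not provide. A minimal reproduction:

```python
# With a one-element list, index -2 is out of range; with one entry per
# layer in --layer_size [64,64], the same access succeeds.
regs = [1e-3]
try:
    _ = regs[-2]                 # the access pattern in NMF.py's create_bpr_loss
except IndexError as e:
    print("IndexError:", e)      # list index out of range

regs = [1e-3, 1e-3]              # one regularization weight per layer
assert regs[-2] == 1e-3          # now valid
```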

TedSIWEILIU commented 5 years ago

Thanks very much, it works now!

May I ask one more question: why do the commands you provided not have this problem?

python NGCF.py --dataset gowalla --regs [1e-5] --embed_size 64 --layer_size [64,64,64]
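A plausible explanation, assuming (not verified against the repo) that NGCF.py reads only the first entry of the --regs list rather than indexing from the end:

```python
# Assumption: NGCF-style code uses a single decay coefficient taken from
# regs[0], so a one-element --regs list is safe no matter how many layers
# --layer_size specifies, whereas NMF.py's regs[-2] needs at least two entries.
regs = [1e-5]          # the single value passed via --regs [1e-5]
decay = regs[0]        # safe for any list length >= 1
print("decay =", decay)
```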