Dinxin opened this issue 5 years ago
Hi,
I used the BPRMF, NeuMF (NMF.py), and NGCF implementations from this repo and tested their performance on ML-100k with an 80/20 split. The results are as follows:
BPRMF:
recall    = [0.21326, 0.33034, 0.47130, 0.56550, 0.62961, 0.68017]
precision = [0.32813, 0.26603, 0.20005, 0.16548, 0.14208, 0.12520]
hit       = [0.90446, 0.95223, 0.97877, 0.98620, 0.99045, 0.99363]
ndcg      = [0.40948, 0.43861, 0.50974, 0.56454, 0.60259, 0.63191]

NeuMF:
recall    = [0.17819, 0.28363, 0.41452, 0.50636, 0.57299, 0.62618]
precision = [0.27473, 0.22930, 0.17781, 0.14894, 0.12938, 0.11508]
hit       = [0.87155, 0.93949, 0.96709, 0.98195, 0.98832, 0.99151]
ndcg      = [0.33682, 0.37366, 0.44835, 0.50564, 0.54613, 0.57706]

NGCF:
recall    = [0.17169, 0.28164, 0.41818, 0.51180, 0.57916, 0.63578]
precision = [0.26783, 0.22569, 0.17757, 0.14943, 0.12984, 0.11571]
hit       = [0.85350, 0.94161, 0.97240, 0.98408, 0.99045, 0.99363]
ndcg      = [0.33925, 0.37733, 0.45599, 0.51345, 0.55410, 0.58633]

It seems that NeuMF and NGCF cannot beat BPRMF? Did you find a similar phenomenon? Thanks.
Thanks for your interest. Could you please provide more details about your experiments, such as the layer sizes, weight decay, learning rate, and whether pretraining was used for each model?
Based on my experience, one possible reason is that the ML-100k dataset is too small and too dense: it contains only 100k interactions among roughly 1,000 users and 1,700 items. And it is well known that deep recommenders need careful regularization to avoid overfitting on such data.
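As a rough illustration of that density point (a back-of-the-envelope check using the standard ML-100k statistics of 943 users and 1,682 items):

# Density of ML-100k: 100,000 ratings over a 943 x 1,682 user-item matrix.
interactions, n_users, n_items = 100_000, 943, 1_682
density = interactions / (n_users * n_items)
print(f"ML-100k density: {density:.2%}")  # ~6.3%, far denser than typical implicit-feedback benchmarks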
I tried to improve NGCF on ML-100k and got better performance:

NGCF (tuned):
recall    = [0.22461, 0.34227, 0.48947, 0.58285, 0.65171, 0.69873]
precision = [0.34437, 0.27734, 0.20985, 0.17335, 0.14878, 0.13023]
hit       = [0.91507, 0.95860, 0.98301, 0.98514, 0.98832, 0.99045]
ndcg      = [0.42971, 0.45348, 0.52414, 0.57802, 0.61781, 0.64464]

With these settings, NGCF can perform better than BPRMF. However, NeuMF still cannot beat BPRMF. I tested NeuMF with different learning rates, embedding sizes, and regularization strengths (e.g., lr=1e-4, keep_prob=0.8, layer_size=64, embed_size=64, reg=1e-5); a hypothetical command with these flags is sketched after the results below. The best performance of NeuMF on ML-100k:

NeuMF (tuned):
recall    = [0.19349, 0.30015, 0.44490, 0.53522, 0.60039, 0.65523]
precision = [0.29766, 0.24225, 0.18848, 0.15681, 0.13563, 0.12051]
hit       = [0.89066, 0.94055, 0.97240, 0.98089, 0.98514, 0.98938]
ndcg      = [0.37598, 0.40518, 0.48263, 0.53730, 0.57684, 0.60823]
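For concreteness, a hypothetical invocation for one of those NeuMF runs, assuming the same NMF.py flag conventions as the command shown later in this thread (the ml-100k dataset name here is made up, and whether --regs needs one entry per layer depends on the discussion below):

python NMF.py --dataset ml-100k --embed_size 64 --layer_size [64] --regs [1e-5] --lr 0.0001 --keep_prob [0.8] --batch_size 1024 --epoch 400 --pretrain 0 --verbose 1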
Can I ask how to run the baseline code, such as NeuMF?
I use the following command:
python NMF.py --dataset gowalla --regs [1e-3] --embed_size 64 --layer_size [64,64] --lr 0.0001 --save_flag 1 --pretrain 0 --batch_size 1024 --epoch 400 --verbose 1 --keep_prob [0.9,0.9]
(I also added a keep_prob argument to the parser.py file.)
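For reference, a minimal sketch of what that addition might look like, assuming parser.py follows the same argparse string-list style as the existing flags (the default and help text here are made up):

import argparse

parser = argparse.ArgumentParser(description="Run NMF.")  # stands in for the repo's existing parser
# Hypothetical new flag, mirroring how list-valued flags such as --regs and
# --layer_size are passed as strings and parsed by the scripts:
parser.add_argument('--keep_prob', nargs='?', default='[0.9,0.9]',
                    help='Keep probability (1 - dropout) for each hidden layer.')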
But it always gave me the following error:
File "NMF.py", line 118, in create_bpr_loss reg_loss = self.regs[-2] * tf.nn.l2_loss(self.weights['h']) IndexError: list index out of range
Do you know what the problem is?
Since you have two layers when setting --layer_size as [64,64], the parameter --regs should correspondingly be set as a list with two entries, e.g., --regs [1e-3,1e-3].
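In other words, the regularization list has to be long enough for the negative indexing in create_bpr_loss. A minimal sketch (not the repo's code) of why the single-entry list fails:

# regs as parsed from --regs [1e-3]: only one entry, so regs[-2] does not exist.
regs = [1e-3]
try:
    coeff = regs[-2]              # same access pattern as self.regs[-2] in NMF.py line 118
except IndexError as err:
    print(err)                    # "list index out of range", the error in the traceback above

# With one entry per layer in --layer_size [64,64], the access is valid:
regs = [1e-3, 1e-3]
coeff = regs[-2]                  # 1e-3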
Thanks very much, it works now!
Can I ask one more question: why do the commands you provided not have this problem?
python NGCF.py --dataset gowalla --regs [1e-5] --embed_size 64 --layer_size [64,64,64]
I can feed the gowalla dataset into the NeuMF model implemented in the repo at https://github.com/hexiangnan/neural_collaborative_filtering. Surprisingly, the NDCG@10 can reach 0.3991... but you reported that the NDCG@20 is only 0.1985... I am very confused... Could you show me the baseline code (the NeuMF code alone would also be acceptable)?