donalee / BUIR

Bootstrapping User and Item Representations for One-Class Collaborative Filtering

Performance is not as good as reported #3

Closed · Coder-Yu closed this issue 2 years ago

Coder-Yu commented 2 years ago

I have tested BUIR on Yelp2018 without filtering out long-tail users/items. The performance is weak, only about 3/4 of LightGCN's. Could the authors give some advice?

donalee commented 2 years ago

Hi, did you compare the performance of LightGCN-BPR with that of LightGCN-BUIR?

Coder-Yu commented 2 years ago

I used BUIR-NB backed by LightGCN and compared it with vanilla LightGCN. I have implemented BUIR in both TF and torch by referring to your implementations, and both fall short of expectations. May I know the expected number of epochs for convergence? I noticed that this method converges slowly, and I only ran it on Yelp2018 for 30 epochs.
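
For reference, here is a minimal PyTorch sketch of the two-branch BUIR objective under discussion (the tensor names and `predictor` module are illustrative, not the exact code in either repository):

```python
import torch
import torch.nn.functional as F

def buir_loss(online_u, online_i, target_u, target_i, predictor):
    # Cross-prediction: the online view of a user predicts the target
    # view of its positive item, and vice versa.
    p_u = F.normalize(predictor(online_u), dim=-1)
    p_i = F.normalize(predictor(online_i), dim=-1)
    z_u = F.normalize(target_u.detach(), dim=-1)  # stop-gradient on targets
    z_i = F.normalize(target_i.detach(), dim=-1)
    # Squared distance between normalized vectors equals 2 - 2*cosine,
    # so this maximizes cosine similarity between the two branches.
    return ((p_u - z_i) ** 2).sum(-1).mean() + ((p_i - z_u) ** 2).sum(-1).mean()
```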

Coder-Yu commented 2 years ago

The performance increases only very slightly after the first 10 epochs. I wonder if it needs hundreds of epochs to reach its best performance. If so, it would be a little inefficient despite its simple structure. My code is here: https://github.com/Coder-Yu/SELFRec/blob/main/model/graph/BUIR.py

The other methods integrated in this library work very well. So, are there any tricks needed for BUIR to outperform LightGCN?

donalee commented 2 years ago

Thanks for the information. It looks like a really nice integrated library for recommender systems. I just checked my working directory again, and I found that the number of epochs required for LightGCN-BUIR to converge is about 600 (for all the datasets used in my experiments). For this reason, I set the maximum number of epochs to 1000 and updated the parameters until the validation performance no longer increased. Yes, I agree that it is inefficient. Another interesting finding is that LightGCN-BUIR seems to require a particularly large number of epochs compared to MF-BUIR.
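
In code terms, that schedule amounts to something like the following sketch (the `patience` value and the `train_one_epoch` / `evaluate` / `save_checkpoint` helpers are hypothetical placeholders, not code from either repository):

```python
# Train up to 1000 epochs and keep the checkpoint with the best
# validation metric, stopping once it no longer improves.
best_metric, patience, bad_epochs = 0.0, 50, 0  # patience value is an assumption
for epoch in range(1000):
    train_one_epoch(model)
    metric = evaluate(model, valid_set)
    if metric > best_metric:
        best_metric, bad_epochs = metric, 0
        save_checkpoint(model)
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```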

As you know, following recent developments in bootstrapping-based self-supervised learning in computer vision, a SimSiam-like variant of BUIR could improve its training efficiency. I think the EMA update of the target encoder makes the training process really inefficient, even though it brings a performance improvement.
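
For concreteness, the EMA update mentioned here is typically implemented like this minimal PyTorch sketch (the momentum value is illustrative):

```python
import torch

@torch.no_grad()
def ema_update(target_net, online_net, momentum=0.995):
    # The target encoder is a slowly moving average of the online encoder;
    # it receives no gradients, only this momentum update each step.
    for t_param, o_param in zip(target_net.parameters(), online_net.parameters()):
        t_param.data.mul_(momentum).add_(o_param.data, alpha=1.0 - momentum)
```

In a SimSiam-style variant, this update disappears entirely: the target branch is just the online encoder's output under stop-gradient, which removes the slow-moving-average dynamics from training.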

Coder-Yu commented 2 years ago

Thanks for your feedback. I have also noticed the advances of SSL in CV and agree with you. I will try assigning a larger number of epochs to BUIR. By the way, in my experience, normalizing the layer embeddings in LightGCN speeds up training, but it seems that this trick does not work with the two-branch structure of BUIR. Our recent method SimGCL (https://github.com/Coder-Yu/SELFRec/blob/main/model/graph/SimGCL.py), published at SIGIR'22, is far better than LightGCN and another SSL-based method, SGL. It requires only 15 epochs to converge on Yelp2018, using the InfoNCE loss. I think the negative-sampling-based CL loss helps speed up training.
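
For reference, a minimal in-batch InfoNCE sketch in PyTorch (the temperature value is illustrative, and this is not SimGCL's exact code):

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.2):
    # In-batch InfoNCE: row k of `positive` is the positive for row k of
    # `anchor`; every other row in the batch serves as a negative.
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature  # (B, B) similarity matrix
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)
```

Contrasting each positive against many in-batch negatives gives a strong gradient signal per step, which may explain the faster convergence compared to BUIR's positive-only objective.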

donalee commented 2 years ago

One more point I want to emphasize: BUIR can be more effective than BPR when there are many "positive-but-unobserved" items in a dataset (and such items are correctly treated as test items for evaluation).

For this reason, I do not think BUIR can always perform better than BPR in the small, artificially designed experiments that we have usually focused on. From this perspective, InfoNCE can be much more effective and efficient, as you mentioned. Anyway, thanks for letting me know about your recent work. I will refer to it in our future work :-)!

Coder-Yu commented 2 years ago

I agree with your comments. It's nice talking to you :-)