enoche / BM3

PyTorch implementation for "Bootstrap Latent Representations for Multi-modal Recommendation" (WWW'23)
GNU General Public License v3.0

Hello, the baby data set recall@10 can only reach 0.0553, not 0.0564 in the paper. #2

Closed tabo0 closed 1 year ago

tabo0 commented 1 year ago

Hello, the baby data set recall@10 can only reach 0.0553, not 0.0564 in the paper.

enoche commented 1 year ago

Hi, thanks for running our code~ May I have more detailed information about your settings, e.g., server/environment/hyper-parameters?

tabo0 commented 1 year ago

Hello, I ran the code on Ubuntu 18.04. The GPU is NVIDIA Corporation Device 2204. I'm running main.py without changing the code.

enoche commented 1 year ago

Thanks for your reply.

For your reference, the best hyper-parameters for Baby are ['n_layers', 'reg_weight', 'dropout', 'seed'] = (1, 0.1, 0.5, 999). Here are the logs re-generated just now: Baby Logs
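As an illustration only, the reported best Baby setting could be expressed as a config dict. The key names mirror the hyper-parameter list above; the actual config format consumed by main.py in this repo may differ, and the inline comments are assumptions about what each key controls.

```python
# Hypothetical sketch: the reported best hyper-parameters for the Baby
# dataset, taken from the comment above. Key names mirror that list;
# the real config schema in this repo may differ.
best_baby_config = {
    "n_layers": 1,      # number of graph layers (assumed meaning)
    "reg_weight": 0.1,  # regularization weight (assumed meaning)
    "dropout": 0.5,     # dropout rate
    "seed": 999,        # random seed
}

def format_config(cfg):
    """Render a config dict as a one-line, sorted summary for logging."""
    return ", ".join(f"{k}={v}" for k, v in sorted(cfg.items()))

print(format_config(best_baby_config))
# dropout=0.5, n_layers=1, reg_weight=0.1, seed=999
```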

Note that different GPU servers or seeds may yield slightly different performance, which is common in deep learning.
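The seed point above can be illustrated with a stdlib-only sketch: fixing the seed makes runs repeatable on one machine, while a different seed (or different hardware/library versions, which this toy example cannot capture) can shift results. In a real PyTorch run you would also seed `torch.manual_seed`, `torch.cuda.manual_seed_all`, and `numpy.random.seed`; note that some CUDA kernels remain nondeterministic even then.

```python
import random

def set_seed(seed):
    """Seed the stdlib RNG. In a PyTorch experiment you would also call
    torch.manual_seed(seed), torch.cuda.manual_seed_all(seed), and
    numpy.random.seed(seed) (depends on your stack)."""
    random.seed(seed)

set_seed(999)
run_a = [random.random() for _ in range(3)]

set_seed(999)
run_b = [random.random() for _ in range(3)]

# Same seed -> identical draws, so a run is repeatable on one machine.
assert run_a == run_b

set_seed(1000)
run_c = [random.random() for _ in range(3)]
# A different seed generally gives different draws, hence slightly
# different final metrics.
print(run_a != run_c)
```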

Please drop me a line if you would like to strictly reproduce my results; I'd be glad to print out my evaluation environment for your reference.


tabo0 commented 1 year ago

Thank you. Here's my training log. Why is my best result not the highest result of an epoch during training?

enoche commented 1 year ago

> Thank you. Here's my training log. Why is my best result not the highest result of an epoch during training?

Good point! In the setting of this repo, the best model is selected by its performance on the validation dataset. Hence, this best model may not be the best on the test dataset. In some other settings, like SimGCL, the best model is selected on the test dataset, which is not applicable in the real world because the test dataset cannot be observed in advance.
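The selection rule described above can be sketched in a few lines: pick the epoch with the best validation metric, then report the test metric at that same epoch, which need not be the highest test value seen during training. The metric values below are made-up toy numbers, not results from the paper.

```python
# Sketch of validation-based model selection, as described above.
def select_best_epoch(valid_metrics, test_metrics):
    """Choose the epoch with the best validation metric and return
    (best_epoch, test metric at that epoch)."""
    best_epoch = max(range(len(valid_metrics)), key=valid_metrics.__getitem__)
    return best_epoch, test_metrics[best_epoch]

# Toy per-epoch recall curves (made-up numbers):
valid = [0.40, 0.45, 0.47, 0.44]
test  = [0.41, 0.46, 0.44, 0.48]

epoch, reported = select_best_epoch(valid, test)
print(epoch, reported)  # epoch 2 is best on validation, so 0.44 is reported...
print(max(test))        # ...even though 0.48 appeared on the test set at epoch 3
```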