reczoo / RecZoo

A curated model zoo for recommendation tasks
Apache License 2.0

About MovieLens-1M Dataset #9

Closed: xizhu1022 closed this issue 2 years ago

xizhu1022 commented 2 years ago

Thanks for your interesting work! The ml-1m dataset provided in this repo appears to be inconsistent with the one reported in the original paper (see the dataset statistics table in Section 4.1): your dataset contains only 895699 interactions, but 995154 are reported. Could you please update the dataset or explain the difference? I would also like to know how you processed the dataset (e.g., how users or items were filtered, and how the train/validation/test sets were split). Looking forward to your reply!

kyriemao commented 2 years ago

Thanks for your attention. The missing 99455 interactions (995154 - 895699) are the ones used only for validation, the same as in the original LCFN paper. They use 796244, 99455, and 99455 interactions for training, test, and validation respectively, and we follow their data split exactly. You can find their original dataset at https://github.com/Wenhui-Yu/LCFN/tree/master/dataset/Movielens.
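As a quick sanity check, the counts quoted above can be verified with a few lines of arithmetic. This is only a sketch using the numbers from this thread; the variable names are made up for illustration and it does not read the actual dataset files.

```python
# Interaction counts quoted in this thread (not read from the dataset files).
n_train = 796_244        # training interactions
n_test = 99_455          # test interactions
n_valid = 99_455         # validation interactions
n_paper_total = 995_154  # total reported in the paper's statistics table
n_repo_total = 895_699   # total counted in this repo's ml-1m files

# Training + test alone account for the count seen in this repo.
assert n_train + n_test == n_repo_total

# Adding the validation split recovers the total reported in the paper.
assert n_train + n_test + n_valid == n_paper_total

# The gap between the two totals is exactly the validation split.
missing = n_paper_total - n_repo_total
assert missing == n_valid

# Fraction of all interactions in each split.
for name, n in [("train", n_train), ("test", n_test), ("valid", n_valid)]:
    print(f"{name}: {n} ({n / n_paper_total:.4f})")
```

All three assertions pass, so the 895699 figure is consistent with counting only the training and test interactions.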

xizhu1022 commented 2 years ago

Thank you! Your reply helps. I will close this issue.