facebookresearch / generative-recommenders

Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
Apache License 2.0

Evaluation on public dataset #94

Open Blank-z0 opened 1 month ago

Blank-z0 commented 1 month ago

Hi, great work! I'm trying to reproduce the results on public datasets. However, I only found the training code, where the model is evaluated on the eval set (or do you not use a train/eval/test split, only train/test?). I'd like to know whether you partitioned the public datasets into a test set, and whether the results reported in the paper correspond to the test set or the eval set. If I want to carve out a test set myself, should I set ignore_last_n=2, 1, and 0 when loading the train, eval, and test datasets, respectively? For example:

train_dataset = DatasetV2(
    ratings_file=dp.output_format_csv(),
    padding_length=max_sequence_length + 1,  # target
    ignore_last_n=2,
    chronological=chronological,
)
eval_dataset = DatasetV2(
    ratings_file=dp.output_format_csv(),
    padding_length=max_sequence_length + 1,  # target
    ignore_last_n=1,
    chronological=chronological,
)
test_dataset = DatasetV2(
    ratings_file=dp.output_format_csv(),
    padding_length=max_sequence_length + 1,  # target
    ignore_last_n=0,
    chronological=chronological,
)
jiaqizhai commented 1 month ago

Thanks for your interest in our work! Please check https://github.com/facebookresearch/generative-recommenders/issues/8