RUCAIBox / RecBole

A unified, comprehensive and efficient recommendation library
https://recbole.io/
MIT License

General Model Performs Better than Context-aware Model #1555

Closed PosoSAgapo closed 1 year ago

PosoSAgapo commented 1 year ago

Hi, I am using both general models and context-aware models on my own dataset, which only has implicit feedback. Since the context-aware models use item features, I would expect them to perform better. However, in my experiments, the context-aware models usually perform much worse than the general models. This is what my item file looks like:

item:token  feat_A:token_seq    feat_B:token    feat_C:token_seq    feat_D:token_seq
0   1 2 3 4 8   77  337 304 325 776 917 2172 2464 2756
1   1 3 8   77  300 1354 2571 2656

Those integers are anonymized categorical features, so I suppose their feature type should not be float_seq. Since my custom dataset only contains implicit feedback, I can only use the negative-sampling approach to train the model. I use the configuration below:

load_col:
  inter: [user, item]
  user: [user, zip_code]
  item: [item, feat_A, feat_B, feat_C, feat_D]
epochs: 500
train_batch_size: 4096
learner: adam
learning_rate: 0.001
eval_step: 1
stopping_step: 30
# evaluation setting
eval_args:                      # (dict) 4 keys: group_by, order, split, and mode
  split: {'RS':[0.9,0.1,0.0]}   # (dict) The splitting strategy ranging in ['RS','LS'].
  group_by: user                # (str) The grouping strategy ranging in ['user', 'none'].
  order: RO                     # (str) The ordering strategy ranging in ['RO', 'TO'].
  mode: full
train_neg_sample_args: {distribution: 'uniform', sample_num: 1}
metrics: ["Recall", "MRR", "NDCG", "Hit", "Precision"]
topk: [50]
valid_metric: Hit@50
eval_batch_size: 4096
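
One way to double-check how RecBole parses these fields is to load the dataset and inspect the resulting feature types. A minimal sketch, assuming the YAML above is saved as my_config.yaml and the atomic files live under dataset/my_dataset/; FM is just a placeholder context-aware model:

from recbole.config import Config
from recbole.data import create_dataset

config = Config(model='FM', dataset='my_dataset',
                config_file_list=['my_config.yaml'])
dataset = create_dataset(config)

# field2type maps each loaded field to how RecBole parsed it,
# e.g. TOKEN, TOKEN_SEQ, FLOAT, FLOAT_SEQ.
for field, ftype in dataset.field2type.items():
    print(field, ftype)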

I have tried several combinations of hyperparameters across different context-aware models, but unfortunately none of them achieves performance comparable to general models like BPR or SimpleX. Since the context-aware models can additionally use item features, I expected a performance boost. What may cause this problem? Could the fact that my dataset only contains implicit feedback be the problem?
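
For reference, a back-to-back comparison like the one described can be run through RecBole's quick-start entry point. A minimal sketch; the model list and the config file name are placeholders, not taken from this issue:

from recbole.quick_start import run_recbole

# Train and evaluate two general models and two context-aware models
# under the same configuration, so the metrics are directly comparable.
for model in ['BPR', 'SimpleX', 'FM', 'DeepFM']:
    run_recbole(model=model, dataset='my_dataset',
                config_file_list=['my_config.yaml'])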

Ethan-TZ commented 1 year ago

@PosoSAgapo Hello, thanks for your attention to RecBole!

Most of the context-aware models in RecBole are designed for the CTR prediction task and are trained point-wise with BCE loss, where sampled negatives are treated as true 0-labels. General models such as BPR are instead trained pair-wise with BPR loss, which directly optimizes the ranking of observed items over sampled negatives. Because your dataset only contains implicit feedback, this mismatch in training objective is a likely reason the context-aware models underperform. If you want to use item features in this setting, you could adapt a context-aware model to train with BPR loss.

I hope my answer can help you!

PosoSAgapo commented 1 year ago

@chenyuwuxin Thanks for your response! Based on your reply, if I want to use item features with implicit feedback, I should implement the BPR loss for the context-aware model, since the context-aware models use BCE loss, which is not appropriate for implicit feedback trained with negative sampling. Is this understanding correct?

Ethan-TZ commented 1 year ago

@PosoSAgapo Yes. BCE loss is usually applied to fine-grained ranking tasks, while BPR loss is often used in coarse-grained recall tasks.
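
For reference, a minimal sketch contrasting the two objectives. The scores are made-up stand-ins for a model's outputs on positive items and uniformly sampled negatives for the same users; only BPRLoss is taken from RecBole:

import torch
from recbole.model.loss import BPRLoss

pos_score = torch.tensor([1.8, 0.4, 2.3])   # scores for observed items
neg_score = torch.tensor([0.2, 0.9, -0.5])  # scores for sampled negatives

# Pair-wise BPR: only the margin pos - neg matters (ranking objective).
pairwise_loss = BPRLoss()(pos_score, neg_score)  # mean of -log(sigmoid(pos - neg))

# Point-wise BCE: treats the sampled negatives as true 0-labels (CTR objective).
scores = torch.cat([pos_score, neg_score])
labels = torch.cat([torch.ones(3), torch.zeros(3)])
pointwise_loss = torch.nn.functional.binary_cross_entropy_with_logits(scores, labels)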