The current CLM strategy is as follows:
The main issues of the current implementation are:
- We mask useful past information when evaluating on the last item only. ==> This is wrong, as past information is valuable context for predictions.
- At inference, padded positions are represented with 0-embeddings, while during training these positions are replaced with trainable [MASK] embeddings. ==> Training, evaluation, and inference should use the same representation strategy.
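The intended behavior for the second point can be sketched as follows. This is a hypothetical illustration, not the actual T4Rec code: the function names and shapes are assumptions, but it shows the idea of reusing the trainable [MASK] embedding for padded positions at inference instead of leaving them as 0-embeddings.

```python
import numpy as np

def apply_mask_embedding(seq_embeddings, padding_mask, mask_embedding):
    """Replace embeddings at padded positions with the [MASK] embedding.

    seq_embeddings: (batch, seq_len, dim) item embeddings
    padding_mask:   (batch, seq_len) boolean, True where the position holds
                    a real item, False where it is padding
    mask_embedding: (dim,) trainable [MASK] vector (a constant here)
    """
    # Broadcast the mask over the embedding dimension and substitute
    # the [MASK] vector wherever the position is padding.
    return np.where(padding_mask[..., None], seq_embeddings, mask_embedding)

# Toy example: batch of 1, sequence of 3, embedding dim 2; last position is padding.
emb = np.array([[[1.0, 1.0], [2.0, 2.0], [0.0, 0.0]]])
pad = np.array([[True, True, False]])
mask_vec = np.array([9.0, 9.0])
print(apply_mask_embedding(emb, pad, mask_vec))
```

With this, the padded position carries the same learned representation at inference as during training, instead of a 0-embedding.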
### Implementation Details :construction:
Updated the class `CausalLanguageModeling` to:
- Replace padded positions with [MASK] embeddings at inference.
- During evaluation on the last item, define the `label_mask` from the padding-mask information (to keep information about the actual past items).
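The last-item evaluation logic can be sketched in isolation. This is a simplified, hypothetical version (the helper name and list-based representation are assumptions, not the T4Rec implementation): only the last non-padded position becomes a prediction target, while the `label_mask` follows the padding mask so past items stay visible as context.

```python
def last_item_eval_masks(item_ids, pad_token=0):
    """Build target and label masks for last-item-only evaluation.

    item_ids: a single session's item id sequence, right-padded with pad_token.
    Returns (target_mask, label_mask) as boolean lists.
    """
    # True where the position holds a real item.
    padding_mask = [i != pad_token for i in item_ids]
    # Position of the last real item: the only evaluation target.
    last = max(idx for idx, real in enumerate(padding_mask) if real)
    target_mask = [idx == last for idx in range(len(item_ids))]
    # The label_mask keeps the padding information, so past items
    # are NOT masked out of the context.
    label_mask = list(padding_mask)
    return target_mask, label_mask

tm, lm = last_item_eval_masks([12, 7, 31, 0, 0])
print(tm)  # target only at the last real position (index 2)
print(lm)  # all real items remain visible
```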
I ran the `t4r_paper_repro` script using 5 days of the ecomrees46 dataset and these are the results:

**CLM run using main branch (w/o fix):**
- Recall@10 of manually masked test data = 0.3915199603272998
- eval_/next-item/recall_at_10 = 0.25108

**CLM run with the fix (evaluation on the last item without masking past information):**
- eval_/next-item/recall_at_10 = 0.2980385422706604
- Recall@10 of manually masked test data = 0.29812381188527975
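For reference, the recall@k metric reported above can be computed as below. This is a generic sketch, not the metric implementation used by the repro script: it counts the fraction of rows whose true next item appears among the top-k scored items.

```python
def recall_at_k(scores, true_ids, k=10):
    """Fraction of rows whose true next item is in the top-k scored items.

    scores:   list of per-row item score lists, shape (batch, num_items)
    true_ids: list of the true next-item index for each row
    """
    hits = 0
    for row, true in zip(scores, true_ids):
        # Indices of the k highest-scoring items for this row.
        topk = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += true in topk
    return hits / len(true_ids)

scores = [
    [0.1, 0.9, 0.3, 0.2],   # true item 1 is ranked 1st
    [0.8, 0.1, 0.05, 0.6],  # true item 2 is ranked 4th
]
print(recall_at_k(scores, [1, 2], k=2))  # → 0.5
```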
### Testing Details :mag:
- Changed `test_mask_only_last_item_for_eval` to get the target-mask information from `lm.masked_targets`
- Changed `test_sequential_tabular_features_ignore_masking`, since the inference mode of CLM now changes the inputs by replacing `0`-padded positions with [MASK] embeddings
## Future work
- The [MASK] embedding is not used to generate the next-item prediction scores, which raises the question of whether we should remove it and simply use the `0`-embedding to represent padded positions.
==> We need to re-run the T4Rec paper experiments without the [MASK] variable and check how the evaluation results are impacted.
Fixes #719