Official implementation of MEANTIME: Mixture of Attention Mechanisms with Multi-temporal Embeddings for Sequential Recommendation, published at RecSys 2020: 14th ACM Conference on Recommender Systems.
This repository contains a PyTorch implementation of MEANTIME, as well as PyTorch translations of various baselines (refer to References below).
Install the required packages into your Python environment:
pip install -r requirements.txt
This code was tested with Python 3.6.9 on Ubuntu with CUDA 10.1 and various types of GPUs.
In order to train MEANTIME, run run.py as follows:
python run.py --templates train_meantime
This will apply all the options specified in templates/train_meantime.yaml, and train MEANTIME on the MovieLens 1M dataset as a result.
You can also apply other templates in the templates folder. For example,
python run.py --templates train_bert
will train the BERT4Rec model instead of MEANTIME.
It is also possible to override some options with command line arguments. For example,
python run.py --templates train_meantime --dataset_code game --hidden_units 128
will use the Amazon Game dataset and a hidden dimension of 128.
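To illustrate how template values and command-line overrides combine, here is a conceptual sketch. The dictionary merge and the example values are illustrative assumptions only; the actual parsing logic lives in meantime/options.

```python
# Conceptual sketch only: a template supplies default options,
# and command-line arguments override them on a per-key basis.
template = {'dataset_code': 'ml-1m', 'hidden_units': 64, 'num_blocks': 2}  # illustrative values
cli_overrides = {'dataset_code': 'game', 'hidden_units': 128}              # from the command line

options = {**template, **cli_overrides}  # later (CLI) values win
print(options['dataset_code'], options['hidden_units'], options['num_blocks'])
```

Keys absent from the command line keep their template defaults, which is why a template plus a couple of flags is enough to specify a full experiment.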
Check out meantime/options for all possible options.
Here is a more detailed explanation of how to train a model. We will explain at two levels ('Big' choices and 'Small' choices). Remember that you can always study the template files in ./templates to learn about desirable choices.
This project is highly modularized so that any (valid) combination of model, dataset, dataloader, negative_sampler and trainer will run.
Currently, this repository provides implementations of MEANTIME and several other baselines (see References).
Choose one of these for the --model_code option.
We experimented with four datasets: MovieLens 1M, MovieLens 20M, Amazon Beauty and Amazon Game.
Choose one of these for the --dataset_code option.
The raw data for these datasets will be automatically downloaded to ./Data the first time they are required. They will be preprocessed according to the related hyperparameters and also saved to ./Data for later re-use. Note that downloading/preprocessing is done only once per setting to save time.
If you want to change the data folder's path from ./Data to somewhere else (e.g. a shared folder), modify the LOCAL_DATA_FOLDER variable in meantime/config.py.
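For example, pointing the data folder at a shared location could look like this. Only the LOCAL_DATA_FOLDER name comes from this README; the path and surrounding comments are placeholders.

```python
# meantime/config.py (excerpt) -- the path below is a placeholder.
# LOCAL_DATA_FOLDER controls where raw and preprocessed datasets are
# stored; the default behaves like ./Data relative to the project root.
LOCAL_DATA_FOLDER = '/mnt/shared/Data'
```

Because downloads and preprocessed caches both land in this folder, a shared mount lets multiple machines reuse the same preprocessed data.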
There is a designated dataloader for each model. Choose the right one for the --dataloader_code option:
The separation reflects the way each model calculates its training loss and the information it requires.
There is a designated trainer for each model. Choose the right one for the --trainer_code option.
However, at this point, all trainers have the exact same implementation thanks to the abstraction given by the models.
There are two types of negative samplers:
Choose one for --train_negative_sampler (used for training) and --test_negative_sampler (used for evaluation).
For every big choice, one can make small choices to modify the hyperparameters that are related to the big choice.
Since there are many options, we suggest looking at meantime/options for the complete list.
Here we will just present some of the important ones.
--max_len: The maximum sequence length for transformer-based models
--hidden_units: The size of the hidden dimension
--num_blocks: Number of transformer layers
--num_heads: Number of attention heads
--absolute_kernel_types: Absolute temporal embedding types to be used in MEANTIME. Look at meantime/options for further information
--relative_kernel_types: Relative temporal embedding types to be used in MEANTIME. Look at meantime/options for further information
--min_rating: Minimum rating to regard as an implicit rating. Interactions whose rating is below this value will be discarded
--min_uc: Discard users whose number of ratings is below this value
--min_sc: Discard items whose number of ratings is below this value
--dataloader_output_timestamp: If true, the dataloader outputs timestamp information
--train_window: How much to slide the training window to obtain subsequences from the user's entire item sequence
--train_batch_size: Batch size for training
--device: CPU or CUDA
--use_parallel: If true, the program uses all visible CUDA devices with DataParallel
--optimizer: Model optimizer (SGD or Adam)
--lr: Learning rate
--saturation_wait_epochs: Training stops early if validation performance doesn't improve for this number of epochs
--best_metric: The metric used to compare models and determine the best one
--train_negative_sample_size: Negative sample size for training
--test_negative_sample_size: Negative sample size for testing

After the training is over, the results will be saved in the folder specified by the options.
More specifically, they are saved in experiment_root/experiment_group/experiment_name. For example, the train_meantime template has
experiment_root: experiments
experiment_group: test
experiment_name: meantime
Therefore, the results will be saved in experiments/test/meantime.
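The result path is simply the three options joined together; a quick sketch (assuming a POSIX path separator):

```python
import os

# The three options from the train_meantime template shown above.
experiment_root = 'experiments'
experiment_group = 'test'
experiment_name = 'meantime'

# The result folder is the straightforward join of the three parts.
result_folder = os.path.join(experiment_root, experiment_group, experiment_name)
print(result_folder)  # experiments/test/meantime
```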
We suggest that users modify the experiment_group and experiment_name options to match their purpose.
In the result folder, you will find:
.
├── config.json
├── models
│ ├── best_model.pth
│ ├── recent_checkpoint.pth
│ └── recent_checkpoint.pth.final
├── status.txt
└── tables
├── test_log.csv
├── train_log.csv
└── val_log.csv
Below are the descriptions for the contents of each file.
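As a quick way to inspect the CSV logs under tables/, something like the following works. The column names below are illustrative assumptions; the actual columns depend on the configured metrics (e.g. --best_metric).

```python
import csv
from pathlib import Path

def best_row(log_path, metric):
    """Return the logged row where `metric` is highest."""
    with open(log_path, newline='') as f:
        rows = list(csv.DictReader(f))
    return max(rows, key=lambda r: float(r[metric]))

# Fabricated stand-in for experiments/test/meantime/tables/val_log.csv;
# the real columns depend on the trainer's configured metrics.
log = Path('val_log.csv')
log.write_text('epoch,NDCG@10\n1,0.51\n2,0.55\n3,0.54\n')
print(best_row(log, 'NDCG@10'))  # the epoch-2 row
```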
If you want to evaluate a pretrained model, you can simply override the --mode option and provide the path to the weights with --pretrained_weights.
For example,
python run.py --templates train_meantime --mode validate --pretrained_weights path/to/weights
will validate the pretrained model on validation data, and
python run.py --templates train_meantime --mode test --pretrained_weights path/to/weights
will do the same on test data.
The above table shows the performance of each model on all four datasets.
We performed grid-search over hyperparameters, and reported optimal results for each combination.
Please refer to our paper for further information.
WANDB is an online service that helps you organize and analyze machine learning experiments.
If you want to log the results to wandb:
1. Install wandb (it is included in requirements.txt)
2. Set USE_WANDB to True in meantime/config.py
3. Run wandb init in the command line to initialize wandb with your own account

Sometimes, when there is not enough time to run all the experiments that you wish to run (especially before the paper's due date), one feels the need to use multiple "remote" machines to run multiple experiments in parallel.
By default, those machines will each save their results in their local result folders (refer to above).
It would be useful if all of those results could be gathered in one designated host machine, making it easy to view and analyze the entire process.
If you want the remote machines to upload their results to the host machine after every experiment:
1. Install the required packages (they are included in requirements.txt)
2. Set the following variables in meantime/config.py:
   HOST (str): IP of the host machine
   PORT (int): port of the host machine
   USERNAME (str): username for ssh/sftp access to the host machine
   PASSWORD (str): password for ssh/sftp access to the host machine
   REMOTE_ROOT (str): path to this project on the host machine
3. Set MACHINE_IS_HOST to False in meantime/config.py
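Put together, the relevant excerpt of meantime/config.py on a remote machine might look like this. Every value below is a placeholder; only the variable names come from this README.

```python
# meantime/config.py (excerpt) -- all values below are placeholders.
MACHINE_IS_HOST = False  # this machine uploads its results instead of hosting them

HOST = '192.0.2.10'            # IP of the host machine
PORT = 22                      # port of the host machine
USERNAME = 'researcher'        # username for ssh/sftp access to the host machine
PASSWORD = 'change-me'         # password for ssh/sftp access to the host machine
REMOTE_ROOT = '/home/researcher/MEANTIME'  # path to this project on the host machine
```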
The baseline codes were translated to PyTorch from the following repositories: