beta-team / beta-recsys

Beta-RecSys: Build, Evaluate and Tune Automated Recommender Systems
https://beta-recsys.readthedocs.io/en/latest/
MIT License

Develop #385

mdimitrov97 closed 3 years ago

mdimitrov97 commented 3 years ago

Attaching a pull request with an initial TiSASRec implementation. Below is a summary of everything I've changed, including the points where I'm not sure I'm doing the right thing.

The code is adapted from https://github.com/pmixer/TiSASRec.pytorch. The same author also wrote the SASRec implementation that I believe was used in beta-recsys, so I followed the structure of how SASRec is done in beta-recsys.

Changes in the following files:

__examples/TiSASRec_Movielens.ipynb__ Notebook following the template of the other notebooks

__configs/tisasrec_default.json__ JSON config following the template of the other json files

__beta_rec/recommenders/__init__.py__ Added the imports and the name of the model here.

__beta_rec/models/tisasrec.py__ This is based on https://github.com/pmixer/TiSASRec.pytorch/blob/master/model.py. I think the classes in this file are okay, and hopefully there are no issues here.

__beta_rec/recommenders/tisasrec.py__ Based on https://github.com/pmixer/TiSASRec.pytorch/blob/master/main.py and https://github.com/pmixer/TiSASRec.pytorch/blob/master/utils.py, as well as the beta-recsys implementation of SASRec. The main issue here is that I'm not sure what to pass as the argument to the seq_train_time function (line 323). I thought I would pass data.get_train_seq(), but this led to a number of indexing issues, so I started working through those.

I then looked at the code from the original TiSASRec implementation, and I now have a number of functions that I'm not sure are really needed: timeSlice, cleanSort, computeRePos and dataPartition. I thought the result of dataPartition would solve all the indexing and data representation issues, but I couldn't get it to work. It also seems to work only with one specific data representation, so I've included the original ml-1m.txt file from the GitHub repo for testing the code. I'm assuming that this file is stored in the main beta-recsys folder for now. ml-1m.txt
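For context, here is a rough sketch of what I understand the relative-time step (computeRePos in the upstream utils.py) to do: build a matrix of absolute time gaps between every pair of positions in a sequence, clipped to a maximum span. This is a simplified pure-Python illustration and not the code in this PR; the name `relative_time_matrix` and its parameters are hypothetical.

```python
def relative_time_matrix(time_seq, time_span):
    """Return an n x n matrix of absolute time gaps, clipped to time_span.

    Hypothetical helper for illustration only; the upstream function
    operates on NumPy arrays rather than plain lists.
    """
    return [
        [min(abs(t_i - t_j), time_span) for t_j in time_seq]
        for t_i in time_seq
    ]


# Three interactions at (already rescaled) timestamps 0, 3 and 10.
timestamps = [0, 3, 10]
matrix = relative_time_matrix(timestamps, time_span=5)
for row in matrix:
    print(row)
```

Gaps larger than `time_span` are capped, so the model only needs embeddings for a bounded number of distinct intervals.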

__beta_rec/core/eval_engine.py and core/train_engine.py__

Due to some differences in indexing with TiSASRec, I had to implement some new functions in eval_engine.py and train_engine.py: seq_predict_time and test_seq_predict_time. These two are used in seq_train_eval_time, which is then called in _seq_train_time. I followed the pipeline of the existing seq_predict, test_seq_predict, seq_train_eval and seq_train functions.

The reason I did this is that, for example, line 368 in test_seq_predict has seq[idx] = i, which is correct for SASRec, but in TiSASRec this needs to be seq[idx] = i[0] and time_seq[idx] = i[1] (lines 421 and 422 in seq_predict_time). I'm not sure if this is needed at all, but I thought it would solve the indexing issues. Most of this new code reuses functions from https://github.com/pmixer/TiSASRec.pytorch/blob/master/utils.py
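To illustrate the indexing difference, here is a minimal sketch of filling the padded item and time sequences when each history entry is an (item, timestamp) pair. It assumes a user's history is a list of such pairs in chronological order and that 0 is the padding value; `pad_history` is a hypothetical name, not a function in this PR.

```python
def pad_history(history, maxlen):
    """Right-align the most recent interactions into fixed-length lists.

    history: list of (item_id, timestamp) pairs, oldest first (assumed).
    Returns (seq, time_seq), both of length maxlen, left-padded with 0.
    """
    seq = [0] * maxlen
    time_seq = [0] * maxlen
    idx = maxlen - 1
    for item, timestamp in reversed(history):
        seq[idx] = item            # SASRec-style code would store the whole entry here
        time_seq[idx] = timestamp  # TiSASRec additionally keeps the timestamp
        idx -= 1
        if idx == -1:              # sequence is full, drop older interactions
            break
    return seq, time_seq


history = [(11, 100), (12, 105), (13, 130)]
seq, time_seq = pad_history(history, maxlen=5)
print(seq)
print(time_seq)
```

With a shorter `maxlen` than the history, only the most recent interactions are kept.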