Librec-like Implementation of timeSVD++ on Python

trimkaleci commented 6 years ago

Hello everyone,

I have written an implementation of timeSVD++ that is similar to the one in Librec. To experiment with it, I am using the test dataset provided with Librec, "ratings-date.txt". To split the dataset into training and testing data, I used the same numbers of rows for the training set and the testing set as in Librec (a fixed number of randomly chosen rows is used for testing, and the rest is used for training). I also followed the same initialization procedure and used similar initial values. Evaluating with RMSE after 80 iterations, my implementation reaches 0.965698745178, whereas the timeSVD++ implementation in Librec reaches 0.914211221245.
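For readers following along, the split and the metric described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `split_fixed_test` and `rmse` are hypothetical and are taken neither from the shared code nor from Librec.

```python
import math
import random


def split_fixed_test(rows, n_test, seed=0):
    """Hold out n_test randomly chosen rows for testing; the rest is training.

    Hypothetical helper: the seed only makes the random choice repeatable.
    """
    if not 0 <= n_test <= len(rows):
        raise ValueError("n_test must be between 0 and len(rows)")
    rng = random.Random(seed)
    shuffled = list(rows)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)


def rmse(predicted, actual):
    """Root mean squared error between two equally long rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))
```

When comparing two implementations, both would need to see the same held-out rows; otherwise part of the RMSE gap may simply come from different splits.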

I have also tested my implementation on the MovieLens 100K dataset (u1.base for training, u1.test for testing), where I get an RMSE of 1.8956 (here I set the number of factors to 20 and the number of bins to 10, and initialized some parameters, such as the user bias and item bias, to zero for each user and item respectively).

I am sharing my implementation with you, which can be found here.

I would really appreciate it if someone could find the time to look over my code, check whether there are any mistakes, and suggest anything that could be improved.

Thanks a lot!

SunYatong commented 6 years ago

Hi, I ran your code five times, and the best RMSE results on ratings-date.txt are around 0.85~0.87 (at epoch 7). The RMSE results of LibRec's TimeSVD are around 0.86~0.87 (at epoch 12). Your result seems slightly better than LibRec's, so your implementation should be correct.

trimkaleci commented 6 years ago

@SunYatong Thanks a lot for taking the time to test my implementation! Could you please tell me what you mean by epoch 7/12? (I know that we split the dataset into bins because we need them for the item bias parameter; since all the ratings in ratings-date.txt are given on a single day, we deal with only 1 bin.)
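To make the point about bins concrete: in timeSVD++ the item bias is time-dependent, b_i(t) = b_i + b_{i,Bin(t)}, where Bin(t) maps the rating date to one of a fixed number of bins. A minimal sketch of such a mapping is given below. The function name `bin_index` and the equal-width scheme over day numbers are assumptions made for illustration; the shared code and Librec may compute the bin differently.

```python
def bin_index(day, min_day, max_day, num_bins):
    """Map an integer day number to an equal-width time bin in [0, num_bins).

    Hypothetical helper: min_day and max_day are the first and last rating
    days in the dataset, expressed as integer day numbers.
    """
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")
    if not min_day <= day <= max_day:
        raise ValueError("day must lie within [min_day, max_day]")
    num_days = max_day - min_day + 1          # number of days covered
    # Integer arithmetic avoids floating-point rounding at bin boundaries.
    return min(num_bins - 1, (day - min_day) * num_bins // num_days)
```

Under this scheme, when every rating falls on the same day, `day - min_day` is always 0, so every rating lands in bin 0 regardless of `num_bins`. That matches the single-bin situation described above.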

SunYatong commented 6 years ago

I mean the model converges at the 7th iteration (epoch); after that, its testing error begins to rise.
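One way to find that point is to record the test RMSE after every epoch and pick the epoch with the lowest value. The sketch below assumes the training loop appends one RMSE value per epoch to a list; the name `best_epoch` is hypothetical and not part of LibRec or the shared code.

```python
def best_epoch(rmse_per_epoch):
    """Return (epoch, rmse) for the lowest recorded RMSE.

    Epochs are counted from 1, so the first list entry is epoch 1.
    If several epochs tie, the earliest one is returned.
    """
    if not rmse_per_epoch:
        raise ValueError("rmse_per_epoch must not be empty")
    idx = min(range(len(rmse_per_epoch)), key=rmse_per_epoch.__getitem__)
    return idx + 1, rmse_per_epoch[idx]
```

Comparing implementations at their respective best epochs, rather than after a fixed 80 iterations, may explain why the numbers reported here differ from those in the first post.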


guoguibing / librec

Librec-like Implementation of timeSVD++ on Python #223