Closed TheEdoardo93 closed 7 years ago
It is matrix factorization. The goal is to minimize the difference between the reconstructed matrix and the input matrix; they don't have to be identical.
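To illustrate the point, here is a minimal NumPy sketch (not this library's code) showing that a low-rank factorization only approximates the input matrix, it does not reproduce it exactly:

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.rand(10, 10)          # a (generically full-rank) 10x10 input matrix

# Rank-3 factorization via truncated SVD: the best rank-3 approximation
# in the least-squares sense (Eckart-Young theorem).
U, s, Vt = np.linalg.svd(A)
W = U[:, :3] * s[:3]          # 10x3 "user" factor
H = Vt[:3, :]                 # 3x10 "item" factor
A_hat = W @ H

# The reconstruction is close to A, but not identical.
err = np.linalg.norm(A - A_hat)
print("reconstruction error:", err)
```

The error is nonzero because a rank-3 product simply cannot represent a full-rank matrix exactly.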
Okay, thanks for the answer, but I have other questions about this topic.

1) I don't understand why, if I apply the PMF or BPMF model with this library, I obtain latent features that also contain negative values. The ratings matrix contains only the entries 0, 1, 2, 3, 4 and 5 (no negative values). Is that possible? I copy-pasted your example code.

2) If I apply the NMF (Non-negative Matrix Factorization) technique (`from sklearn.decomposition import NMF`) to a matrix A with dimensions [10, 10]:

```
A = [[ 0.12625305 0.30978375 0.30415786 0.42546815 0.84200692 0.23241193 0.78670725 0.7618932 0.69563467 0.5866224 ]
     [ 0.01743334 0.53193025 0.24858198 0.96367046 0.32004272 0.65359156 0.45029779 0.78915741 0.99218829 0.92447303]
     [ 0.52022379 0.02042781 0.1915887 0.94017405 0.21289058 0.95838386 0.50880476 0.32742556 0.87621962 0.84681934]
     [ 0.0282506 0.06367262 0.35524991 0.62248901 0.95327913 0.54744748 0.35342385 0.25593403 0.49169813 0.19864078]
     [ 0.23601037 0.45487854 0.95634693 0.87010446 0.81376886 0.00181897 0.35715158 0.59365696 0.52841866 0.8638987 ]
     [ 0.44691085 0.79517402 0.15642157 0.2438334 0.07887692 0.79571398 0.26534944 0.86961929 0.27328381 0.19134173]
     [ 0.3439889 0.40121153 0.03619149 0.95924414 0.3170521 0.02388609 0.04856726 0.41966615 0.28492617 0.20625199]
     [ 0.74082635 0.60233115 0.34709545 0.02091571 0.88651045 0.81214676 0.95056368 0.90685866 0.22626875 0.06767816]
     [ 0.83163061 0.54685084 0.45286537 0.68746794 0.30366078 0.24971967 0.87337281 0.41271786 0.93144887 0.48590846]
     [ 0.49507609 0.26737458 0.3843407 0.99255407 0.30388398 0.02507288 0.53330802 0.11694593 0.75758704 0.35467839]]
```
I obtain two matrices W and H whose product gives an "optimal" approximation of A. The W x H matrix multiplication:

```
[[ 0.12820193 0.31069438 0.30316712 0.42462545 0.84311116 0.23203029 0.78541355 0.76057125 0.69688356 0.58747347]
 [ 0.01639517 0.53052884 0.25015509 0.96521958 0.31817125 0.65419791 0.45312595 0.79104169 0.99001833 0.92284522]
 [ 0.52101971 0.02432804 0.1905339 0.9391737 0.21415551 0.95822023 0.50637001 0.32673502 0.87736515 0.84793737]
 [ 0.02886074 0.06350618 0.35528845 0.62284234 0.95260781 0.54757363 0.35455101 0.25611015 0.49123933 0.19864947]
 [ 0.23600156 0.45496531 0.95640733 0.86995638 0.81356006 0.00377819 0.35700375 0.59335604 0.52897431 0.86398017]
 [ 0.44789327 0.7961735 0.15593094 0.2425944 0.08090765 0.79507688 0.26623304 0.86830453 0.2751274 0.19189357]
 [ 0.34379423 0.40175899 0.03695956 0.95960783 0.31617689 0.02395014 0.0507778 0.41913215 0.2842138 0.20679367]
 [ 0.73995924 0.60153788 0.34757634 0.02179307 0.88629989 0.8125988 0.95038521 0.90782452 0.22656139 0.06839238]
 [ 0.83063422 0.54603023 0.45419856 0.68875036 0.30265228 0.24996152 0.87476172 0.41370722 0.93008105 0.48516807]
 [ 0.49604429 0.26828597 0.38257796 0.99124156 0.3061904 0.02519926 0.53173017 0.11655131 0.75897825 0.35521481]]
```
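That NMF workflow can be reproduced with a sketch like the following (a random matrix stands in for the one above; `n_components=5` is an arbitrary choice). Note that NMF constrains both factors to be non-negative, which is why W and H contain no negative entries, unlike PMF's latent features:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(42)
A = rng.rand(10, 10)  # stand-in for the 10x10 matrix above

# NMF factors A into two non-negative matrices W (10x5) and H (5x10).
model = NMF(n_components=5, init='random', random_state=42, max_iter=500)
W = model.fit_transform(A)   # non-negative
H = model.components_        # non-negative

A_hat = W @ H  # close to A entry-by-entry, but not exactly equal
print("max abs error:", np.abs(A - A_hat).max())
```

The product W @ H approximates A well, just as in the pasted output, without ever matching it exactly.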
If I apply the PMF or BPMF model to a similar ratings matrix, I obtain a result that in my opinion is wrong: the values are very small, and some are negative. Is that possible? What is the reason? If I multiply the latent features, I obtain a matrix completely different from the initial ratings matrix. The following example shows what I mean. B is the initial ratings matrix, with shape [10, 10]:

```
B = [[ 4. 0. 0. 4. 4. 0. 1. 0. 0. 4.]
     [ 1. 1. 0. 1. 0. 0. 0. 3. 1. 0.]
     [ 3. 5. 0. 3. 3. 0. 0. 0. 0. 3.]
     [ 4. 4. 0. 0. 0. 0. 1. 0. 4. 4.]
     [ 4. 0. 0. 4. 1. 0. 0. 4. 0. 4.]
     [ 1. 4. 0. 0. 4. 4. 4. 0. 0. 0.]
     [ 1. 0. 0. 4. 0. 4. 0. 4. 1. 0.]
     [ 4. 4. 0. 0. 0. 4. 4. 0. 0. 4.]
     [ 3. 0. 0. 0. 2. 0. 0. 2. 2. 2.]
     [ 3. 0. 0. 3. 0. 0. 0. 5. 3. 3.]]
```
PMF User features =

```
[[ 2.06057492e+09 2.64891153e+09 2.86497574e+09 2.64238426e+09 1.97686890e+08]
 [ 5.08472373e+11 6.04756314e+11 6.55789219e+11 6.06024662e+11 2.58378740e+10]
 [ 1.43217491e+11 1.70217891e+11 1.84440689e+11 1.70782966e+11 7.51060279e+09]
 [ -3.54408339e+11 -4.21138916e+11 -4.50022123e+11 -4.16972027e+11 -1.58716224e+10]
 [ 1.25354045e+11 1.49843490e+11 1.59335366e+11 1.48313616e+11 6.10142119e+09]
 [ 5.57590359e+11 6.55247906e+11 7.09811172e+11 6.58969440e+11 2.72385918e+10]
 [ 1.10902936e+12 1.35206384e+12 1.42385189e+12 1.31925897e+12 5.48787289e+10]
 [ -9.02146879e+10 -1.07988678e+11 -1.14775227e+11 -1.06669578e+11 -4.27041390e+09]
 [ 1.79588513e+11 2.16873310e+11 2.27484628e+11 2.11714541e+11 8.36376693e+09]
 [ -1.09751053e+08 -1.32740617e+08 -1.39084452e+08 -1.29243876e+08 -4.91920490e+06]]
```

PMF Item features =

```
[[ 1.79204503e+12 2.83587799e+12 2.46328534e+12 1.64701623e+12 8.08031302e+11]
 [ 1.29541803e+11 2.04106230e+11 1.75657276e+11 1.26349620e+11 6.34080400e+10]
 [ 1.58969584e-02 1.10375141e-02 6.56329589e-02 1.38182951e-02 1.96582362e-02]
 [ 4.74562751e+11 7.66077795e+11 6.54383582e+11 4.31967580e+11 2.04746924e+11]
 [ 3.61667507e+10 5.39737826e+10 4.81834047e+10 3.49728742e+10 1.89680925e+10]
 [ 7.79977021e+10 1.22746205e+11 1.06911043e+11 7.11300707e+10 3.51876601e+10]
 [ 3.54655730e+09 5.38233214e+09 4.81937265e+09 3.39274226e+09 1.78283828e+09]
 [ -6.08947669e+10 -9.77807886e+10 -8.41433104e+10 -5.50214773e+10 -2.62440637e+10]
 [ 2.59666061e+11 4.11624835e+11 3.61331526e+11 2.29902189e+11 1.10855005e+11]
 [ 9.57120729e+07 1.43628497e+08 1.24285898e+08 9.35479098e+07 5.07232946e+07]]
```
If I multiply these two latent feature matrices, I don't obtain the initial ratings matrix but a very different one. Why does this happen?

3) Does the number of features I can choose to extract depend on some factor (e.g. the rank of the ratings matrix), or can I choose any value?
Thanks for your attention and for answering these three questions!
I normalize the ratings (so the rating distribution has zero mean) before running the algorithm: https://github.com/chyikwei/recommend/blob/master/recommend/pmf.py#L64
And add it back in prediction: https://github.com/chyikwei/recommend/blob/master/recommend/pmf.py#L133
It is still possible to get negative ratings or ratings greater than the max value, since we don't place any constraints on the rating distribution, so I clip the values at the end of prediction: https://github.com/chyikwei/recommend/blob/master/recommend/pmf.py#L135-L139
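The normalize / add-back / clip steps described above can be sketched in plain NumPy (this is not the library's actual code; the rating bounds and the raw model output are assumed values):

```python
import numpy as np

ratings = np.array([5.0, 3.0, 1.0, 4.0])   # toy observed ratings
min_rating, max_rating = 1.0, 5.0           # assumed rating bounds

# 1) Normalize: shift ratings to zero mean before training.
mean_rating = ratings.mean()                # 3.25
centered = ratings - mean_rating

# 2) A raw prediction is a dot product of latent vectors;
#    add the mean back to return to the original scale.
raw_prediction = 2.1                        # pretend model output
prediction = raw_prediction + mean_rating   # 5.35

# 3) Nothing constrains the result to [min, max], so clip it.
prediction = np.clip(prediction, min_rating, max_rating)
print(prediction)  # 5.0
```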
For (2): in a recommendation system, the input ratings matrix is "incomplete". The objective function minimizes the difference between the "observed" entries and the ratings reconstructed from the latent features (it is not trying to rebuild your initial ratings matrix).
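A minimal sketch of that objective: the squared error is computed only over a mask of observed entries, so whatever the reconstruction puts in the unobserved cells contributes nothing to the loss (toy numbers, not the library's code):

```python
import numpy as np

# Toy ratings matrix: 0 marks an unobserved entry.
R = np.array([[4.0, 0.0, 2.0],
              [0.0, 3.0, 0.0]])
observed = R > 0                      # boolean mask of known ratings

# Pretend some latent factors produced this reconstruction.
R_hat = np.array([[4.1, 9.9, 1.8],
                  [7.0, 3.2, 0.5]])

# The loss compares observed cells only; the wild values at the
# unobserved positions (9.9, 7.0, 0.5) are ignored entirely.
loss = np.sum((R[observed] - R_hat[observed]) ** 2)
print(loss)  # 0.01 + 0.04 + 0.04 = 0.09
```

This is why multiplying the learned factors back together can look nothing like the original matrix at the unobserved positions.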
For (3): the number of features can be any positive integer.
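To illustrate the trade-off behind that choice, here is a sketch (truncated SVD on random data, not this library's code) showing that a larger feature count fits the matrix more closely, at the cost of more parameters:

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.rand(10, 10)
U, s, Vt = np.linalg.svd(A)

# Reconstruction error shrinks as the rank k grows,
# reaching ~0 only at full rank (k = 10 here).
errors = []
for k in (1, 3, 5, 10):
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    errors.append(np.linalg.norm(A - A_k))
print(errors)
```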
For question #1: if I want the "correct" values inside the user and item latent features, do I have to add the mean rating value (pmf.user_features_ = pmf.user_features_ + pmf.mean_rating_, and the same for the item latent features), or are the values inside the latent features already "correct"?
I will use these two latent feature matrices after training the PMF model, as in the example below:

```python
pmf.fit(training_set, n_iters=evaluation_iterations)
training_predictions = pmf.predict(training_set[:, :2])
training_rmse = RMSE(training_predictions, training_set[:, 2])
testing_predictions = pmf.predict(testing_set[:, :2])
testing_rmse = RMSE(testing_predictions, testing_set[:, 2])
print("After %d iterations, training phase RMSE: %.6f, testing phase RMSE: %.6f" % (
    evaluation_iterations, training_rmse, testing_rmse))

x = pmf.user_features_
print "PMF User features dimensions = " + str(x.shape)
print "PMF User features = " + str(x)
```
Thanks for your answer!
No, you should not add the mean rating to the feature matrices directly.
The prediction for user i, item j is (user_features[i] * item_features[j]) + mean rating.
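In code, that prediction rule looks like this (a plain NumPy sketch with made-up factors and mean, not the library's implementation):

```python
import numpy as np

rng = np.random.RandomState(1)
n_users, n_items, n_features = 4, 6, 3
user_features = rng.randn(n_users, n_features)  # one latent vector per user
item_features = rng.randn(n_items, n_features)  # one latent vector per item
mean_rating = 3.2                               # mean of the training ratings

def predict(i, j):
    # Dot product of the two latent vectors, plus the global mean.
    return user_features[i] @ item_features[j] + mean_rating

print(predict(0, 0))
```

The mean enters only at prediction time; the raw factors themselves stay centered around zero, which is why they can contain negative values.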
Please see the prediction function: https://github.com/chyikwei/recommend/blob/master/recommend/pmf.py#L133
Hi, I have a question: why, if I multiply item_features and user_features, don't I obtain the initial matrix that I gave as input to the PMF (or BPMF) model? It is a matrix factorization technique, isn't it? If so, multiplying those factors should give back the initial matrix.