biolab / orange3-recommendation

🍊 :thumbsdown: Add-on for Orange3 to support recommender systems.

How to obtain the benchmarked RMSE? #7

Open hhchen1105 opened 7 years ago

hhchen1105 commented 7 years ago

I tried to reproduce the RMSE for ml-100k reported in the benchmarks. However, the number I got is 0.91474, far worse than the reported score of 0.810, even though I computed it on the training data. May I ask how to obtain the benchmark numbers? Here is my code:

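import math
import Orange
from orangecontrib.recommendation import BRISMFLearner
from sklearn.metrics import mean_squared_error  # assumed source; the original post did not show this import

data = Orange.data.Table('movielens100k.tab')
brismf = BRISMFLearner(num_factors=15, num_iter=15, learning_rate=0.07, lmbda=0.1)  # parameters reported on the benchmarks page
recommender = brismf(data)
y_pred = recommender(data)  # predictions on the same rows used for training
rmse = math.sqrt(mean_squared_error(data.Y, y_pred))
print(rmse)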

Thanks.

lanzagar commented 7 years ago

I tried running your code. I didn't know where you imported mean_squared_error from, so I used rmse = math.sqrt(((data.Y - y_pred)**2).mean()) instead. In 3 runs, I got the following results: 0.8136, 0.8122, 0.8135.

That is still not as good as reported, but better than your results. I'm not sure why.
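
For reference, the substitute expression used in this comment is the standard root-mean-square error. Below is a minimal NumPy-only sketch of it; the helper name `rmse` and the toy rating arrays are illustrative assumptions and are not part of the add-on or of the original posts.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally sized arrays (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Same computation as math.sqrt(((y_true - y_pred) ** 2).mean())
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Toy ratings, made up for illustration only
toy_true = np.array([4.0, 3.0, 5.0, 2.0])
toy_pred = np.array([3.5, 3.0, 4.0, 2.5])
toy_score = rmse(toy_true, toy_pred)
print(round(toy_score, 4))  # sqrt(0.375), approximately 0.6124
```

With the thread's variables, the equivalent call would be `rmse(data.Y, y_pred)`.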

hhchen1105 commented 7 years ago

I tried running the same code on another machine and got a much better result (0.8133), but on the original machine, I still got 0.91xx. :(

A related issue: on the new machine, I split the data into 80% training and 20% testing. With the same parameters, the test RMSE is 0.9502, much worse than the training RMSE of 0.7727. It would probably be better to state on the benchmarks page that the reported RMSEs are in fact "training" RMSEs.
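
The 80/20 comparison described above can be sketched without the add-on. The example below is only an illustration under stated assumptions: the ratings are synthetic, the "model" is a stand-in that predicts the global mean of the training ratings (not BRISMF), and the helper names `holdout_indices` and `rmse` are hypothetical. It shows how training and test RMSE are computed on disjoint rows, not how the numbers in this thread were produced.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error (hypothetical helper)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def holdout_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and return (train_idx, test_idx) for a holdout split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_fraction))
    return order[n_test:], order[:n_test]


# Synthetic ratings in 1..5, made up for illustration only
rng = np.random.default_rng(42)
ratings = rng.integers(1, 6, size=1000).astype(float)

train_idx, test_idx = holdout_indices(len(ratings), test_fraction=0.2, seed=0)

# Stand-in model: always predict the mean of the training ratings
global_mean = ratings[train_idx].mean()
train_rmse = rmse(ratings[train_idx], np.full(len(train_idx), global_mean))
test_rmse = rmse(ratings[test_idx], np.full(len(test_idx), global_mean))

print("train RMSE:", train_rmse)
print("test RMSE:", test_rmse)
```

Fixing the `seed` argument keeps the split identical between runs, which makes scores from different machines easier to compare.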