zenogantner / MyMediaLite

recommender system library for the CLR (.NET)
http://mymedialite.net

Randomize ordering of candidate_items in ItemRecommender evaluations #425

Closed. jcnewell closed this issue 10 years ago.

jcnewell commented 11 years ago

The prediction values in ItemAttributeKNN usually come from a small set of discrete values, so many items share the same prediction value. In the current implementation of MyMediaLite.Eval.Items, items that share a prediction value are ranked in the order in which they appear in candidate_items. If that order changes, the evaluation results can change significantly.

Proposed solution: randomise the order of candidate_items before evaluating each user in MyMediaLite.Eval.Items. This gives results that are not sensitive to the ordering of items in the training and test data.
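
To make the effect concrete, here is a minimal Python sketch of the tie-breaking issue and the proposed shuffle. It is not MyMediaLite code: the names `rank_items` and `precision_at_k`, the toy scores, and the use of precision@1 are all assumptions chosen for illustration.

```python
import random

def rank_items(scores, candidate_items, rng=None):
    """Rank candidates by descending score (hypothetical helper).

    Without rng, tied items keep their order from candidate_items.
    With rng, a copy of the candidates is shuffled first, so ties are
    broken at random instead of by input order.
    """
    items = list(candidate_items)  # copy, so the caller's list is untouched
    if rng is not None:
        rng.shuffle(items)
    # sorted() is stable, also with reverse=True: equal scores keep their order
    return sorted(items, key=lambda item: scores[item], reverse=True)

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

# Toy data: three items tie on the top score, and only "c" is relevant.
scores = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 0.5}
relevant = {"c"}

# The same scores give different results depending on candidate order.
p_first = precision_at_k(rank_items(scores, ["a", "b", "c", "d"]), relevant, 1)
p_second = precision_at_k(rank_items(scores, ["c", "a", "b", "d"]), relevant, 1)

# With shuffling, the average over many runs no longer depends on input order.
rng = random.Random(0)
trials = 3000
p_shuffled = sum(
    precision_at_k(rank_items(scores, ["a", "b", "c", "d"], rng), relevant, 1)
    for _ in range(trials)
) / trials

print(p_first, p_second, round(p_shuffled, 2))
```

In this toy case the unshuffled precision@1 is either 0 or 1 depending only on where "c" sits in the candidate list, while the shuffled average settles near 1/3, the chance that "c" comes first among the three tied items.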

zenogantner commented 10 years ago

Hi Chris, sorry for answering so late on this. My notification settings were not ideal; I have changed them now.

This has been done for --overlap-items and --all-items since version 3.08. Do you think it should be done in all cases?

jcnewell commented 10 years ago

Hi Zeno,

You wrote:

This has been done for --overlap-items and --all-items since version 3.08. Do you think it should be done in all cases?

Oops - sorry. I'm obviously not up-to-date with the latest version.

I always use overlap-items so that's fine for me.

BTW, a while ago you mentioned you were about to start structural changes, so I held off updating the Java port. Is that phase over, or should I wait a while? (I'm not in a hurry.)

Chris



zenogantner commented 10 years ago

See the branch new_new_backend. It is not finalized yet, but I think we will get there by the end of the year.