recommenders / rival

RiVal recommender system evaluation toolkit
rival.recommenders.net
Apache License 2.0

[Question] Use precision and recall metrics for implicit feedback evaluation #89

Closed: aleSuglia closed this issue 9 years ago

aleSuglia commented 9 years ago

As the title suggests, this is not really an issue with RiVal, but a question I'd like to ask you in order to understand how I should use the tool for my task.

I'm implementing a recommender system for a top-N recommendation task in an implicit feedback context. By "implicit feedback context" I mean that I only know from my dataset that a user "likes" an item, nothing more (my dataset contains tuples of the form (user, item)).

So I've decided to construct a DataModel by associating with each tuple (u_m, i_k) in the dataset a preference of 1 for user u_m on item i_k. This is a sketch of the code I use:

    for (String[] pair : dataset) {  // pair = {user, item}
        model.addPreference(pair[0], pair[1], 1d);
    }

After that, I construct a second data model that contains at most N items for each user, according to the top-N recommendation task. Here I also associate a preference of 1 with each item that is present in a user's recommendation list.
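
A sketch of how I build this predictions model (users and topN() are placeholders for my set of users and my recommender's ranked output):

    // Build the predictions DataModel: for every user, store a preference of 1
    // for each item in that user's top-N recommendation list.
    DataModel<String, String> predictions = new DataModel<>();
    for (String user : users) {
        for (String item : topN(user)) {  // topN(user): ranked list of at most N items
            predictions.addPreference(user, item, 1d);
        }
    }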

I create the Precision and Recall objects using their main constructor, which receives the predictions model and the test set model as parameters. When I compute precision and recall, I get NaN values from getValueAt(), and I'm starting to think I'm doing something wrong. The per-user metrics are also equal to NaN.

Can you help me solve this?

Thank you in advance.

abellogin commented 9 years ago

Hi @aleSuglia, if I have understood your question correctly, you want to use RiVal for "implicit feedback". This is not currently supported, although we aim to support it in the future. The first step you describe (addPreference) makes sense. The second step (constructing a data model with top-N elements) would make sense if this is the output of a recommender; is that what you mean? In that case, the preference value you store there should not matter much, as long as you change the relevance threshold passed to the evaluation metrics. Finally, regarding the NaN values you mention, please check that you are calling the method compute() prior to getValue().

Regards, Alejandro

aleSuglia commented 9 years ago

What I do in the second step is: for each user, compute a list of at most N items ranked by a score produced by the algorithm I've implemented; then, for each of these items, I add a preference (with value 1) to the predictions model.

I always call compute() before calling getValue(), but I still get NaN. Here is a brief example of my code, in which I compute cumulative precision and recall for each fold:

    DataModel<String, String> predictions = Utils.loadData();
    DataModel<String, String> testData = Utils.loadData();

    Precision<String, String> precision = new Precision<>(predictions, testData);
    Recall<String, String> recall = new Recall<>(predictions, testData);
    precision.compute();
    recall.compute();

    int cutoff = 10;
    cumulativePrecision += precision.getValueAt(cutoff);
    cumulativeRecall += recall.getValueAt(cutoff);

Is there anything wrong?

EDIT: Looking at the code of the Precision and Recall classes, I've seen that I need to specify a relevance threshold and a cutoff array in the constructor. For the cutoff array I use an int array containing only 10, but what should I use for the relevance threshold in my case?

abellogin commented 9 years ago

Yes, that is what I was going to suggest: include the array of cutoffs. Your relevance threshold should be 1, since you want any item that appears in the test set to count as relevant.
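
Something like this, assuming (as in your EDIT) that the relevance threshold and the cutoff array are the last two constructor arguments:

    // Relevance threshold 1.0: any item in the test set counts as relevant.
    int[] cutoffs = new int[]{10};
    Precision<String, String> precision = new Precision<>(predictions, testData, 1.0, cutoffs);
    Recall<String, String> recall = new Recall<>(predictions, testData, 1.0, cutoffs);
    precision.compute();
    recall.compute();
    double p10 = precision.getValueAt(10);
    double r10 = recall.getValueAt(10);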

aleSuglia commented 9 years ago

Actually, the precision values are around 7.173601147776183E-5 for each fold, so I think my algorithm performs quite poorly. Is there anything else I should fix in order to test my results correctly?

abellogin commented 9 years ago

I would suggest you test two baseline recommenders: a random one and one that always recommends the most popular items. These two should give you useful reference scores.
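
For the popularity baseline, a minimal sketch could look like the following (mostPopularBaseline, train, and testUsers are hypothetical names; the DataModel usage is the same as in your snippets above):

    import java.util.*;
    import java.util.stream.Collectors;
    import net.recommenders.rival.core.DataModel;

    // Score each item by its interaction count in the training pairs,
    // then recommend the same top-n list to every test user.
    static DataModel<String, String> mostPopularBaseline(
            List<String[]> train, Set<String> testUsers, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] pair : train) {            // pair = {user, item}
            counts.merge(pair[1], 1, Integer::sum);
        }
        List<String> topItems = counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        DataModel<String, String> predictions = new DataModel<>();
        for (String user : testUsers) {
            double score = topItems.size();      // descending scores preserve the ranking
            for (String item : topItems) {
                predictions.addPreference(user, item, score--);
            }
        }
        return predictions;
    }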

aleSuglia commented 9 years ago

Thank you for your help.

What would you need in order to support implicit feedback evaluation?

alansaid commented 9 years ago

Implicit feedback is still recorded as an interaction, so the functionality isn't related to implicit feedback per se, but rather to unary/binary data.

Having support for this would be great, so let's answer the question with a yes and create issues for adding the functionality.

abellogin commented 9 years ago

I have created the following issues: #103, #104, #105, #106.