benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License
3.57k stars 612 forks

Wrong precision metric #158

Open kolomietsdv opened 6 years ago

kolomietsdv commented 6 years ago

Ben, hello! Thank you for your cool library!

Could you please add an option for a clean precision evaluation?

In the evaluation module, precision@k effectively becomes the maximum of precision and recall, because the denominator is computed with this line: total += fmin(K, likes.size())

Why do you do that? Could you provide an option to use total = K in your precision@k function?

Thank you!

benfred commented 6 years ago

The idea with that is that if you're asking for P@10, and say no user has more than 5 items in the test set, then the best possible value is 0.5 no matter how good the recommendations are. This scales it back up so that the max score is 1.0, whatever your test data distribution looks like. I made this change because I think it makes the evaluation function more useful for hyperparameter optimization.
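As a toy illustration of the two conventions (plain Python, not the actual Cython in the evaluation module; the function names here are made up):

def precision_at_k_standard(hits, K):
    # classic p@k: the denominator is always K
    return hits / float(K)

def precision_at_k_scaled(hits, K, num_test_likes):
    # the variant used here: the denominator is min(K, number of test items),
    # so a user with only 5 test items can still score 1.0 at K=10
    return hits / float(min(K, num_test_likes))

# the example above: all 5 of a user's test items appear in the top 10
print(precision_at_k_standard(5, 10))    # 0.5
print(precision_at_k_scaled(5, 10, 5))   # 1.0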

Are you trying to compare this with p@k implementations in other packages?

kolomietsdv commented 6 years ago

Exactly. I want to compare this using p@k and map@k, and I can't change the metric evaluation in either case. Also, it would be a good idea to have an option to evaluate precision at several different k's in a single pass. Thank you for your answer.

benfred commented 6 years ago

There are other issues with comparing the metrics reported here with those from other packages =(.

I'm removing each user's training-set liked items from the recommendations when evaluating, and many other packages don't. I think this is the right thing to do, because leaving them in skews the results in a weird way.
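As a rough sketch of the difference (a toy function, not the library's actual evaluation code):

def toy_precision_at_k(ranked_items, test_likes, train_likes, K=10, filter_train=True):
    # with filter_train=True this mimics the behaviour here: items the user
    # already liked in training never count against the top K.
    # many other packages effectively behave like filter_train=False.
    if filter_train:
        ranked_items = [item for item in ranked_items if item not in train_likes]
    top_k = ranked_items[:K]
    return len(set(top_k) & set(test_likes)) / float(K)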

For comparing across recsys packages, I usually just create a shim to adapt. For example, with the code below you can wrap a LightFM matrix factorization model so that it runs through the evaluation code here:

import numpy as np
import multiprocessing

from implicit.recommender_base import MatrixFactorizationBase

class LightFMAdaptor(MatrixFactorizationBase):
    def __init__(self, epochs=20, num_threads=0, *args, **kwargs):
        super(LightFMAdaptor, self).__init__()

        # create a LightFM model using the supplied parameters
        from lightfm import LightFM
        self.model = LightFM(*args, **kwargs)

        self.epochs = epochs
        self.show_progress = True
        self.num_threads = num_threads or multiprocessing.cpu_count()

    def fit(self, item_users):
        # fit the wrapped model
        self.model.fit(item_users.T.tocoo(), 
                       num_threads=self.num_threads,
                       epochs=self.epochs,
                       verbose=self.show_progress)

        # convert model attributes back to this class, so that
        # the recommend/similar_items etc calls on the base class will work
        items, users = item_users.shape
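        # the bias trick: padding the user factors with [bias, 1] and the item
        # factors with [1, bias] means the plain dot product used by implicit
        # reproduces lightfm's user_emb.dot(item_emb) + user_bias + item_bias score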
        self.user_factors = np.concatenate((self.model.user_embeddings,
                                            self.model.user_biases.reshape(users, 1),
                                            np.ones((users, 1))), axis=1).copy()
        self.item_factors = np.concatenate((self.model.item_embeddings,
                                            np.ones((items, 1)),
                                            self.model.item_biases.reshape(items, 1)),
                                            axis=1).copy()
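
A rough usage sketch (untested; it assumes you already have a user x item sparse interaction matrix called user_items, plus the train_test_split and precision_at_k helpers from implicit.evaluation):

from implicit.evaluation import train_test_split, precision_at_k

# user_items: scipy.sparse user x item matrix of implicit feedback (assumed to exist)
train, test = train_test_split(user_items, train_percentage=0.8)

# constructor arguments are passed straight through to LightFM
model = LightFMAdaptor(no_components=32, loss='warp', epochs=20)
model.fit(train.T.tocsr())   # the adaptor's fit() expects an items x users matrix

print(precision_at_k(model, train, test, K=10))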