lyst / lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.
Apache License 2.0

The expected workflow for mapping external ids when calling predict #515

Closed zkid18 closed 4 years ago

zkid18 commented 4 years ago

According to the source code, the expected input format for the predict function is the following:

Arguments
        ---------
        user_ids: integer or np.int32 array of shape [n_pairs,]
             single user id or an array containing the user ids for the
             user-item pairs for which a prediction is to be computed. Note
             that these are LightFM's internal id's, i.e. the index of the
             user in the interaction matrix used for fitting the model.
        item_ids: np.int32 array of shape [n_pairs,]
             an array containing the item ids for the user-item pairs for which
             a prediction is to be computed. Note that these are LightFM's
             internal id's, i.e. the index of the item in the interaction
             matrix used for fitting the model.

What is the expected way to map between external and internal ids at the inference stage? For now I use the private attribute dataset._user_id_mapping, which I don't consider the proper way to solve this.

# external id
client_id = 14092
# internal LightFM id
dataset._user_id_mapping[client_id]

SimonCW commented 4 years ago

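Hi, if you use lightfm.data.Dataset to build your interaction and feature matrices (which you should), you can use Dataset.mapping() to convert back and forth. It returns a tuple of four dictionaries (user id map, user feature map, item id map, item feature map), each keyed by the external value, so going from internal back to external means inverting the relevant dictionary.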
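
A minimal sketch of that lookup, not code from this thread. To keep it runnable without LightFM installed, `example_mapping` is a hand-written stand-in for what `dataset.mapping()` would return on a fitted `Dataset`; the helper name `build_id_lookups` and the sample ids are hypothetical.

```python
def build_id_lookups(mapping):
    """Split the 4-tuple from Dataset.mapping() into the lookups needed at inference.

    `mapping` is expected to be
    (user id map, user feature map, item id map, item feature map),
    where each map is a dict from external value to internal index.
    """
    user_id_map, _user_feature_map, item_id_map, _item_feature_map = mapping
    # Invert the item map so predicted internal indices can be reported
    # with their external ids.
    internal_to_external_item = {
        internal: external for external, internal in item_id_map.items()
    }
    return user_id_map, item_id_map, internal_to_external_item


# Stand-in for `dataset.mapping()`; with a real Dataset you would pass that instead.
example_mapping = (
    {14092: 0, 20311: 1},                      # user id map
    {},                                        # user feature map (unused here)
    {"item_a": 0, "item_b": 1, "item_c": 2},   # item id map
    {},                                        # item feature map (unused here)
)

user_id_map, item_id_map, item_lookup = build_id_lookups(example_mapping)

internal_user = user_id_map[14092]   # external -> internal, replaces dataset._user_id_mapping
external_item = item_lookup[2]       # internal -> external
```

The internal user index obtained this way is what predict expects as `user_ids`, so the private `_user_id_mapping` attribute is not needed.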

For my application, I wrote a wrapper around LightFM to map between internal and external ids, but for large datasets you have to consider the performance cost of translating back and forth many times.
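
The wrapper itself was not shared, so the following is only a sketch of one way to keep translation cheap: build both directions once and translate in batches at the boundaries. The class name `IdTranslator` is hypothetical, and it assumes internal ids are contiguous integers starting at 0, as indices into the interaction matrix are.

```python
class IdTranslator:
    """Hypothetical helper that caches both directions of an id mapping."""

    def __init__(self, external_to_internal):
        # e.g. the user id map or item id map taken from Dataset.mapping()
        self._to_internal = dict(external_to_internal)
        # Assumption: internal ids are 0..n-1, so a list can serve as the
        # inverse lookup and is built only once.
        self._to_external = [None] * len(self._to_internal)
        for external, internal in self._to_internal.items():
            self._to_external[internal] = external

    def to_internal(self, external_ids):
        """Translate a batch of external ids; raises KeyError for unknown ids."""
        return [self._to_internal[e] for e in external_ids]

    def to_external(self, internal_ids):
        """Translate a batch of internal indices back to external ids."""
        return [self._to_external[i] for i in internal_ids]


items = IdTranslator({"item_a": 0, "item_b": 1, "item_c": 2})
internal = items.to_internal(["item_c", "item_a"])
external = items.to_external(internal)
```

With LightFM, the internal ids would still need converting to an np.int32 array before being passed to predict. Keeping everything in internal ids between the input and output boundaries avoids repeated translation.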

zkid18 commented 4 years ago

Hi Simon! Thanks for the comment, I didn't notice Dataset.mapping. Would you mind sharing your wrapper?

SimonCW commented 4 years ago

Sorry, I cannot do that because there is customer-specific logic in there. @zkid18 I propose closing this issue.

zkid18 commented 4 years ago

@SimonCW I'll close the issue and attach my implementation of a mapper later. I'd also appreciate it if someone would share their mapping as well.