benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License

How do users with a single item interaction affect the predictions for new users? #393

Open DaStapo opened 4 years ago

DaStapo commented 4 years ago

If I use my model exclusively for new users, should I include users who have had only one product interaction in the training set?

I know this is a theoretical question unrelated to the implementation.

cakedev0 commented 4 years ago

For models based purely on collaborative filtering (as all the models in this package are), users who have had only one product interaction are useless in the training set. More precisely, if you consider the bipartite graph of user-item interactions, you can prune it repeatedly until every remaining node (user or item) has a degree of at least 2.
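The pruning described above can be sketched roughly as follows. This is only an illustration, not something the package provides: `prune_min_degree` is a hypothetical helper, and it assumes the interactions are available as a user-by-item matrix where any nonzero entry counts as an interaction. The loop matters because dropping a low-degree item can push a user below the threshold, and vice versa.

```python
import numpy as np
from scipy.sparse import csr_matrix


def prune_min_degree(interactions, min_degree=2):
    """Repeatedly drop users and items with fewer than `min_degree` interactions.

    Hypothetical helper (not part of implicit). Returns the pruned matrix
    plus the original row and column indices that survived.
    """
    matrix = csr_matrix(interactions, copy=True)
    matrix.eliminate_zeros()  # make sure stored zeros are not counted as edges
    user_ids = np.arange(matrix.shape[0])
    item_ids = np.arange(matrix.shape[1])

    while True:
        user_degree = matrix.getnnz(axis=1)  # interactions per user
        item_degree = matrix.getnnz(axis=0)  # interactions per item
        keep_users = user_degree >= min_degree
        keep_items = item_degree >= min_degree
        if keep_users.all() and keep_items.all():
            break
        # Removing nodes lowers their neighbours' degrees, so loop again.
        matrix = matrix[keep_users][:, keep_items]
        user_ids = user_ids[keep_users]
        item_ids = item_ids[keep_items]

    return matrix, user_ids, item_ids


# Toy data: user 3 has a single interaction, and removing items 2 and 3
# then leaves user 2 with a single interaction as well.
toy = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
])
pruned, kept_users, kept_items = prune_min_degree(toy)
```

On this toy matrix only users 0 and 1 and items 0 and 1 survive. The returned index arrays let you map rows and columns of the pruned matrix back to the original ids.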

Besides, how many interactions do new users have? If it's only one or two, you should probably consider models that leverage user features (if you have some; it can be age, gender, OS, marketing channel, ...). You can take a look at https://github.com/lyst/lightfm, but I really don't know whether it's a good library or not.

DaStapo commented 4 years ago

My suspicion was that the general popularity of an item would have an effect in cases where there is very little overlap in interactions.
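One way to check how much popularity signal is at stake is to compare raw per-item interaction counts with and without the single-interaction users. The sketch below is only illustrative: the helper names are hypothetical, the toy matrix is made up, and it says nothing about how any particular model in the package would respond to the change. It only shows the counts themselves.

```python
import numpy as np
from scipy.sparse import csr_matrix


def item_popularity(interactions):
    """Number of users who interacted with each item (hypothetical helper)."""
    return csr_matrix(interactions).getnnz(axis=0)


def item_popularity_without_single_users(interactions):
    """Same count, ignoring users who have exactly one interaction or none."""
    matrix = csr_matrix(interactions)
    user_degree = matrix.getnnz(axis=1)
    return matrix[user_degree > 1].getnnz(axis=0)


# Made-up data: item 2 is the most popular item, but only among users
# who have a single interaction.
toy = np.array([
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
])
full_counts = item_popularity(toy)
filtered_counts = item_popularity_without_single_users(toy)
```

Here the full counts are `[1, 1, 3]` while the filtered counts are `[1, 1, 0]`, so in this toy case the most popular item disappears from the data entirely once single-interaction users are dropped.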