lyst / lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.
Apache License 2.0
4.77k stars, 691 forks

Added cold-start example along with instructions to build user/item features #591

Open V-Sher opened 3 years ago

V-Sher commented 3 years ago

This PR adds a working example that explains in depth how to create user/item features and demonstrates using LightFM to solve the cold-start problem. It also links to an article that provides further clarity for LightFM users.

SimonCW commented 3 years ago

Hi @V-Sher,

thanks for the PR. I think it is a good idea to tackle the topic of building user/item features, since this part of the library seems unclear to many users. However, I don't think it would be a good idea to link to a private Medium article (which might even be inaccessible behind the paywall).

Maybe you can try focusing the notebook on the issue of creating user/item features and directly add the necessary explanations in the notebook markdown?

Another way would be to improve upon the official example on hybrid / cold-start to illustrate the process of building user features.

V-Sher commented 3 years ago

@SimonCW Ahh, fair point. I will try to add all the necessary explanations in Markdown in the next couple of days! Thanks for the suggestion.

ankurrajdev commented 3 years ago

What if the user features contain non-binary values, such as total orders or avg order value? We would then be passing every unique value of that variable in the uf argument.

The current example only has binary input.

Would this approach work for that kind of input for user_features?
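One possible alternative to listing every unique value is to keep a single feature name per numeric variable and attach the scaled value as a weight. The sketch below is only an illustration of that idea: the helper `scale_numeric_features`, the sample users, and the choice of min-max scaling are assumptions, not something taken from the PR or the notebook.

```python
def scale_numeric_features(rows, numeric_cols):
    """Min-max scale numeric columns and emit (user_id, {feature: weight}) pairs.

    Hypothetical helper for illustration; it is not part of LightFM.
    """
    # Find the observed range of each numeric column.
    bounds = {}
    for col in numeric_cols:
        values = [row[col] for row in rows]
        bounds[col] = (min(values), max(values))

    weighted = []
    for row in rows:
        weights = {}
        for col in numeric_cols:
            low, high = bounds[col]
            span = high - low
            # A constant column carries no signal, so it gets weight 0.0.
            weights[col] = (row[col] - low) / span if span else 0.0
        weighted.append((row["user_id"], weights))
    return weighted


# Made-up sample data, for illustration only.
users = [
    {"user_id": "u1", "total_orders": 2, "avg_order_value": 10.0},
    {"user_id": "u2", "total_orders": 12, "avg_order_value": 35.0},
    {"user_id": "u3", "total_orders": 7, "avg_order_value": 60.0},
]
user_feature_rows = scale_numeric_features(users, ["total_orders", "avg_order_value"])
```

As far as I know, `lightfm.data.Dataset.build_user_features` accepts rows of the form `(user id, {feature name: feature weight})`, so `user_feature_rows` could be passed to it after calling `fit` with the two feature names. Note that min-max scaling gives the smallest value a weight of 0.0, and that `build_user_features` normalizes each row by default, so the scaling scheme and the `normalize` setting are worth checking against your data.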

Also, I just wanted to confirm something about evaluating the recommender: the user_matrix passed in the training step should be different from the one passed in the predict/evaluate step. Otherwise, it can cause data leakage if we use variables such as avg order value in the last x days.
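To make the leakage concern concrete, here is a minimal sketch of computing a time-windowed feature strictly from data before a cutoff date, so that features used for training never see orders from the held-out period. The function `avg_order_value_as_of`, the order tuples, and the dates are all hypothetical and only illustrate the idea.

```python
from datetime import date, timedelta


def avg_order_value_as_of(orders, cutoff, window_days):
    """Average order value per user over the window_days before cutoff.

    Only orders with start <= order_date < cutoff are counted, so nothing
    on or after the cutoff can influence the feature.
    """
    start = cutoff - timedelta(days=window_days)
    totals = {}
    for user_id, order_date, value in orders:
        if start <= order_date < cutoff:
            total, count = totals.get(user_id, (0.0, 0))
            totals[user_id] = (total + value, count + 1)
    return {user_id: total / count for user_id, (total, count) in totals.items()}


# Made-up orders: (user_id, order_date, order_value).
orders = [
    ("u1", date(2021, 1, 10), 20.0),
    ("u1", date(2021, 2, 20), 40.0),
    ("u1", date(2021, 3, 15), 90.0),  # falls in the held-out period
    ("u2", date(2021, 2, 5), 15.0),
]

train_cutoff = date(2021, 3, 1)
train_features = avg_order_value_as_of(orders, train_cutoff, window_days=60)
```

In this sketch the 90.0 order from March is excluded from `train_features` because it happens after `train_cutoff`. Whether the evaluation step should reuse these features or recompute them with a later cutoff depends on how the train/test split is defined.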