Open longlnnm opened 7 years ago
Can you have a look at the updated documentation for user and item features?
The gist of it is that user and item ids are simply indices into rows of their feature matrices: if you supply feature matrices that have features, you are using features, and the user ids are only a means of telling the model which row of the features matrix it should use.
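To illustrate the point about ids being row indices, here is a minimal NumPy sketch. It is not LightFM's implementation; the embedding values and the `user_representation` helper are made up for illustration. It only shows that the id selects a row of the feature matrix, so two users with identical rows get identical representations.

```python
import numpy as np

# Hypothetical learned embeddings: one row per user feature
# (for example an age bucket and a height bucket).
feature_embeddings = np.array([[0.5, -0.1],
                               [0.2, 0.3]])

# Feature matrix: one row per user, one column per feature.
user_features = np.array([[1.0, 1.0],   # row 0: user 1
                          [1.0, 1.0],   # row 1: user 2 (same features as user 1)
                          [1.0, 0.0]])  # row 2: a user with only the first feature


def user_representation(user_id):
    # The id is used only to pick a row of the feature matrix.
    return user_features[user_id] @ feature_embeddings


# Identical feature rows give identical representations.
same = np.allclose(user_representation(0), user_representation(1))
```

If an identity block is concatenated to the feature matrix, each trained user additionally gets their own column, so rows 0 and 1 would no longer be identical.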
If my understanding is right, the docs say it's better to provide a userid-userid identity matrix concatenated with the feature matrix. So the overall user feature matrix would still have a unique userid in every row?
Say two users, 1 and 2, have the same two features, age and height. Given a new user with the same age and height, what will the model predict? If we feed in the new user's age and height as you describe, which row should the new user's features be in? Or does it not matter at all, and any row is OK because their features are exactly the same?
Hope this helps!
My problem is more of an implementation problem.
Suppose I have a NEW USER D who does not have any interaction data. I have trained with users A, B, C, who have interactions with items 1 to 9: A1, A2, A3, B4, B5, B6, C7, C8, C9. Now say D has user features similar to A's (age, gender, job...). My goal is to recommend NEW ITEMS 11, 12, 13, whose features are similar to those of items 1, 2, 3 from user A. So this is a problem of recommending new items to a new user, based on similar user and item features.
What item_features and user_features do I pass to the predict method if I want to predict: predict(D.id, item_ids=?, item_features=?, user_features=?)
Is it correct to call it like this: predict(0, item_ids=[11,12,13], item_features=[11's features, 12's features, 13's features], user_features=[D.features])
You're close. Remember, user and item ids are indices into their respective feature matrices. In this case, your item matrix has three rows, so you'd need to do:
predict(0, item_ids=[0, 1, 2], item_features=[11's features, 12's features, 13's features], user_features=[D.features])
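The indexing rule in this reply can be sketched in plain NumPy. This is only an approximation of what a factorization model computes (biases are omitted); the `score` function and the random embeddings are hypothetical stand-ins, not LightFM code. It shows why item ids must be 0, 1, 2 when the item feature matrix passed in has three rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_user_feats, n_item_feats, dim = 3, 4, 2

# Stand-ins for feature embeddings learned during training (random here).
user_feat_emb = rng.normal(size=(n_user_feats, dim))
item_feat_emb = rng.normal(size=(n_item_feats, dim))


def score(user_id, item_ids, item_features, user_features):
    u = user_features[user_id] @ user_feat_emb    # pick one user row by id
    v = item_features[item_ids] @ item_feat_emb   # pick item rows by id
    return v @ u                                  # one score per item


d_features = np.array([[1.0, 0.0, 1.0]])          # a single row: user D
new_items = np.array([[1, 0, 0, 1],               # row 0: item 11
                      [0, 1, 0, 1],               # row 1: item 12
                      [0, 0, 1, 1]], dtype=float) # row 2: item 13

# Ids are positions in the matrices passed in, not the original catalogue ids.
scores = score(0, [0, 1, 2], new_items, d_features)

# Using the catalogue ids fails in this sketch: there is no row 11 in a 3-row matrix.
try:
    score(0, [11, 12, 13], new_items, d_features)
    out_of_range = False
except IndexError:
    out_of_range = True
```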
Thank you very much. That explains a lot
I have questions about this problem also:
Thanks! :)
@maciejkula I have a few questions to ask. Please help me with these.
In movie prediction, when predicting recommendations for a new user: in model.fit() I pass user_features as a concatenation of the identity matrix and the feature matrix. But to predict for a new user, we use model.predict(0, np.arange(n_items), user_features=<user feature matrix of shape (1, len(features))>). Here the user_features passed to model.predict has only as many columns as there are user features, whereas the user_features passed to model.fit() has as many columns as the features plus the identity matrix. Can you please tell me if this is the correct way?
If the user/item features are given like 'age': 1-90, 'location': usa, singapore, italy, etc., how do I convert them into binary to create a feature matrix?
I need to perform recommendation on very small data (users: 5-6, items: 50-60) and I am not getting good results. Can you please suggest the minimum number of users, items, interactions per user, interactions per item, and user/item features needed for a decent result?
You're close. Remember, user and item ids are indices into their respective feature matrices. In this case, your item matrix has three rows, so you'd need to do:
predict(0, item_ids=[0, 1, 2], item_features=[11's features, 12's features, 13's features], user_features=[D.features])
@maciejkula, is there a special reason you used 0 for your user_ids? If this is a cold start problem for users, shouldn't you need to add a new index?
@freytheviking, the id must correspond to an index (row) in the user_features (CSR) matrix passed in to predict (which has num_features columns). So if you construct the user_features matrix to contain only 1 row, it will correctly select that row.
If the id does not correspond to a row, you will receive the exception: *** Exception: Number of user feature rows does not equal the number of users
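For the case where training used an identity block concatenated with metadata features, one way to build a single-row matrix for a new user is sketched below. It assumes the prediction-time matrix needs the same column layout as the training matrix, so the identity block is filled with zeros for a user the model has never seen. The arrays are dense and the values made up for readability; in practice they would be converted to a sparse CSR matrix before being passed to the model.

```python
import numpy as np

n_train_users = 3

# Hypothetical metadata features for the three training users.
train_meta = np.array([[1, 0],
                       [0, 1],
                       [1, 1]], dtype=float)

# Training matrix: per-user indicator columns followed by metadata columns.
train_user_features = np.hstack([np.eye(n_train_users), train_meta])

# New user: no indicator column of their own, so that block is all zeros.
new_meta = np.array([[1, 0]], dtype=float)
new_user_features = np.hstack([np.zeros((1, n_train_users)), new_meta])

# Both matrices have n_train_users + n_metadata_features columns,
# and the new user is row 0 of the one-row matrix.
```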
@mm27368 regarding your second point, there is a Dataset helper class for doing that. Have a look at the class documentation and this tutorial
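For readers who want to see what such a conversion amounts to, here is a hand-rolled sketch that does not use the Dataset helper. The bucket boundaries, feature names, and the `encode_user` function are hypothetical choices for illustration: age is bucketed into ranges, and each bucket and each location becomes one binary column.

```python
import numpy as np

# Hypothetical feature vocabulary: age buckets plus one column per location.
AGE_BUCKETS = [(1, 30), (31, 60), (61, 90)]
LOCATIONS = ["usa", "singapore", "italy"]
FEATURE_NAMES = ([f"age:{lo}-{hi}" for lo, hi in AGE_BUCKETS]
                 + [f"location:{loc}" for loc in LOCATIONS])


def encode_user(age, location):
    row = np.zeros(len(FEATURE_NAMES))
    # Set the column of the age bucket the user falls into.
    for i, (lo, hi) in enumerate(AGE_BUCKETS):
        if lo <= age <= hi:
            row[i] = 1.0
    # Set the column of the user's location.
    row[len(AGE_BUCKETS) + LOCATIONS.index(location)] = 1.0
    return row


users = [(25, "usa"), (47, "italy")]
# One row per user, one binary column per feature.
user_features = np.vstack([encode_user(age, loc) for age, loc in users])
```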
Hello, I would like to ask how we can predict using user features instead of user ids. Having user ids means that LightFM needs to train with that user first. However, if I use user features, I can use the training data of users with similar features and recommend similar items.
Both predict and predict_rank require user ids. Is there any way I can use user features to predict instead?
predict(user_ids, item_ids, item_features=None, user_features=None, num_threads=1)
predict_rank(test_interactions, train_interactions=None, item_features=None, user_features=None, num_threads=1)