Hey,
After reading issues #320 and #497, I created 3 different models:
Train a model with the user and item features.
Train a model only with user features
Train a model only with item features
For now I have 1 user feature, "country", and 2 item features, "gender" and "category".
To keep it simple, let's say they take these values:
country: "c1", "c2", "c3"
gender: "m", "f"
category: "a", "b", "c", "d"
All the features are categorical. To fit the values to the dataset I call:
dataset.fit( users=user_features['user_id'].unique(), items=item_features['item_id'].unique(), item_features=["m", "f","a", "b", "c", "d"], user_features=["c1","c2","c3"] )
To build the item features for the model:
item_tuple = ((1, ['f', 'a']), (2, ['m', 'b'])...)
item_features_m = dataset.build_item_features(item_tuple)
To build the user features for the model:
user_tuple = ((1, ['c1']), (2, ['c2'])...)
user_features_m = dataset.build_user_features(user_tuple)
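For reference, here is a minimal sketch of how tuples in this shape could be built from raw rows. The row dictionaries and the helper `to_feature_tuples` are hypothetical and made up for illustration; only the `(id, [feature values])` shape and the feature values come from the snippets above.

```python
# Hypothetical raw rows standing in for the real user/item feature tables.
user_rows = [
    {"user_id": 1, "country": "c1"},
    {"user_id": 2, "country": "c2"},
]
item_rows = [
    {"item_id": 1, "gender": "f", "category": "a"},
    {"item_id": 2, "gender": "m", "category": "b"},
]


def to_feature_tuples(rows, id_key, feature_keys):
    """Return [(id, [feature values, ...]), ...] from a list of row dicts."""
    return [(row[id_key], [row[key] for key in feature_keys]) for row in rows]


user_tuple = to_feature_tuples(user_rows, "user_id", ["country"])
item_tuple = to_feature_tuples(item_rows, "item_id", ["gender", "category"])

# Sanity check: every value used in the tuples should be one of the
# feature names that were passed to dataset.fit(...).
known_user_features = {"c1", "c2", "c3"}
known_item_features = {"m", "f", "a", "b", "c", "d"}
unknown_user = {f for _, feats in user_tuple for f in feats} - known_user_features
unknown_item = {f for _, feats in item_tuple for f in feats} - known_item_features
```

If either `unknown_user` or `unknown_item` is non-empty, a feature value is being used that was never registered in the fit call.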
Then I call the fit function of a WARP LightFM model.
My problem is that when I add the user features, all users get almost the same predictions: across 100,000 users, only 16 unique items are recommended out of 4,000 available items.
When I remove the user features and train without them, the results are much better: across 100,000 users, 2,000 unique items are recommended out of 4,000 available items.
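To make the "unique items" numbers concrete, this is roughly the kind of count I mean, as a simplified sketch. The function name `count_unique_top_items`, the nested-dict score layout, and the cutoff `k` are assumptions for illustration, and the toy scores are invented, not my real model output.

```python
def count_unique_top_items(scores_by_user, k=1):
    """Count distinct items that appear in any user's top-k list.

    scores_by_user: {user_id: {item_id: predicted_score}} (assumed layout).
    """
    recommended = set()
    for item_scores in scores_by_user.values():
        # Item ids sorted by score, highest first, cut to the top k.
        top_k = sorted(item_scores, key=item_scores.get, reverse=True)[:k]
        recommended.update(top_k)
    return len(recommended)


# Toy case 1: every user ranks the same item first (the collapsed behaviour).
collapsed = {
    1: {"i1": 0.9, "i2": 0.1, "i3": 0.2},
    2: {"i1": 0.8, "i2": 0.3, "i3": 0.1},
    3: {"i1": 0.7, "i2": 0.2, "i3": 0.4},
}
# Toy case 2: each user has a different top item (the healthier behaviour).
varied = {
    1: {"i1": 0.9, "i2": 0.1, "i3": 0.2},
    2: {"i1": 0.1, "i2": 0.8, "i3": 0.3},
    3: {"i1": 0.2, "i2": 0.1, "i3": 0.7},
}

collapsed_count = count_unique_top_items(collapsed)
varied_count = count_unique_top_items(varied)
```

A low count relative to the number of available items means the model recommends nearly the same items to everyone.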
In the future, though, I want to use more features, which may work better than pure matrix factorization, and I need the features to make predictions for cold-start users.
Any answer and advice will help me.
Thank you all