tensorflow / recommenders

TensorFlow Recommenders is a library for building recommender system models using TensorFlow.
Apache License 2.0

Giving recommendations to new user based on model's current status. #306

Open Coding-daredevil opened 3 years ago

Coding-daredevil commented 3 years ago

Every tutorial I looked into gives recommendations based on users in the dataset. What if a new one appears? Let's say user AA shows up (the model took data from users A through Z during training) and has made some searches/ratings, so we can build a similar DataFrame for them with user_id, movie_title, user_rating, etc.

If an unknown user is simply entered into the model then (obviously) the model won't be of much use. What I've been thinking is to take the additional information and, using whatever we have at the moment, find the most similar user (typical CF) and base the initial query on that one.
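A minimal sketch of that "most similar user" idea, in plain Python. It assumes ratings are held as per-user dictionaries and uses cosine similarity; the names `cosine_similarity`, `most_similar_user`, and the sample data are hypothetical and not part of TensorFlow Recommenders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two {item: rating} dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def most_similar_user(new_ratings, known_ratings):
    """Return the id of the known user whose ratings best match the new user's."""
    return max(
        known_ratings,
        key=lambda user_id: cosine_similarity(new_ratings, known_ratings[user_id]),
    )


# Hypothetical ratings for users the model was trained on.
known = {
    "A": {"Alien": 5.0, "Heat": 1.0},
    "B": {"Alien": 1.0, "Heat": 5.0},
}
# Ratings collected from the new, unseen user.
new_user = {"Alien": 4.0, "Heat": 2.0}

proxy_user = most_similar_user(new_user, known)
```

The resulting `proxy_user` id could then be passed to the trained model in place of the unknown id. This only approximates the new user's tastes and ignores everything that distinguishes them from the proxy.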

Is there any way to implement this? I'd like to pass the mini-dataframe (with the new user's ratings) as an input of sorts but I am not exactly certain how I should handle it afterwards. How can I add this additional information to my model?

maciejkula commented 3 years ago

This is a common use case. Please have a look at this paper for some ideas.

Coding-daredevil commented 3 years ago

Apologies, I'm kinda new to this and I'm thrown off balance occasionally. Basically I've tried a few things on my own which didn't work out. I've seen this paper as well, but I might have misjudged it (as a more theoretical read). What I did try was building a string of my item IDs in place of the user id and using TextVectorization rather than StringLookup, but it still only seems to accept identical inputs (anything else gets extremely low scores). Perhaps I'm understanding something wrong.

maciejkula commented 3 years ago

You can have a look at https://github.com/tensorflow/recommenders/issues/119 for some more details about how this can be implemented.

The basic idea is to use a sequence of item ids, then process it much like a language model.
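One possible shape for such a query tower is sketched below. It is an assumption-laden illustration, not the approach from the linked issue: the catalogue, the embedding size, the choice of a GRU, and the helper name `encode_history` are all hypothetical, and the weights are untrained, so the scores are meaningless until the model is fitted.

```python
import tensorflow as tf

# Hypothetical catalogue; real item ids would come from your own products data.
catalogue = ["item_a", "item_b", "item_c", "item_d"]
embedding_dim = 8

# mask_token="" reserves index 0 for padding; index 1 is the out-of-vocabulary bucket.
item_lookup = tf.keras.layers.StringLookup(vocabulary=catalogue, mask_token="")
history_embedding = tf.keras.layers.Embedding(item_lookup.vocabulary_size(), embedding_dim)
sequence_encoder = tf.keras.layers.GRU(embedding_dim)
candidate_embedding = tf.keras.layers.Embedding(item_lookup.vocabulary_size(), embedding_dim)


def encode_history(history):
    """Map a padded batch of item-id strings to one query vector per user."""
    ids = item_lookup(history)
    vectors = history_embedding(ids)
    # Padding positions (index 0) are masked so the GRU skips them.
    return sequence_encoder(vectors, mask=tf.not_equal(ids, 0))


# A new user is described only by the items they interacted with, not by a user id.
new_user_history = tf.constant([["item_a", "item_c", ""]])
query = encode_history(new_user_history)

# Score every catalogue item against the query and keep the two best.
candidates = candidate_embedding(item_lookup(tf.constant(catalogue)))
scores = tf.matmul(query, candidates, transpose_b=True)
top_scores, top_indices = tf.math.top_k(scores, k=2)
recommended = tf.gather(tf.constant(catalogue), top_indices)
```

Because the user is represented by their history rather than by an id in a lookup table, a user who never appeared in training can still be encoded, as long as the items they interacted with are in the vocabulary.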

Coding-daredevil commented 3 years ago

Hm, that's where I am at. I think I might have failed due to using strings: an item id may look like 'xxxxxxx//xxxxx', so I would have something like ['xxxxxxxx//xxxx', 'xxxxxxxx//xxxxx', etc]. Going from a pandas.Series of these to a tensor is a bit strange, so I ended up joining them into a single string 'xxxxxxxx//xxxxx xxxxxxxxx//xxxxx', with spaces separating the items, which I planned to tokenize afterwards with TextVectorization. I think this is where I'm losing it, as the example you've sent me (I had looked into it) uses arrays. I might try something similar: see exactly how the data is handled there and work likewise.
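For what it's worth, a space-joined string like that can be turned back into a padded array of ids without TextVectorization. The sketch below assumes the ids contain no whitespace; the ids `shop//101` and so on are made-up placeholders, not real data from this thread.

```python
import tensorflow as tf

# Hypothetical item ids containing "//", standing in for the real ones.
item_lookup = tf.keras.layers.StringLookup(
    vocabulary=["shop//101", "shop//202", "shop//303"], mask_token=""
)

# One space-joined history string per user.
histories = tf.constant(["shop//101 shop//202", "shop//303"])

# Split on whitespace only, so the "//" inside each id is left untouched.
tokens = tf.strings.split(histories)            # RaggedTensor, one row per user
padded = tokens.to_tensor(default_value="")     # dense, padded with "" to the longest row
ids = item_lookup(padded)                       # integer ids, 0 where padded
```

Splitting on whitespace alone sidesteps TextVectorization's default standardization, which lowercases text and strips punctuation and could therefore alter ids such as these.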

Thanks a lot for answering, I know I am just looking oblivious :p

maciejkula commented 3 years ago

Your questions make sense - consider moving from pandas to something more suitable, like tf.data.

Pandas is a good starting place for exploratory data analysis but isn't particularly well suited for tasks like this.

Coding-daredevil commented 3 years ago

Thanks, basically I do it this way:

tf_ratings = tf.data.Dataset.from_tensor_slices(dict(ratings))
tf_products = tf.data.Dataset.from_tensor_slices(dict(products))

And I can easily proceed as guided by the tutorial.