Open Coding-daredevil opened 3 years ago
This is a common use case. Please have a look at this paper for some ideas.
Apologies, I'm kind of new to this and get thrown off balance occasionally. I've tried a few things on my own that didn't work out. I've seen this paper as well, but I may have misjudged it as a more theoretical read. What I tried was building a string of item IDs in place of the user ID and using TextVectorization rather than StringLookup, but it still only seems to work for identical inputs (anything else gets extremely low scores). Perhaps I'm misunderstanding something.
You can have a look at https://github.com/tensorflow/recommenders/issues/119 for some more details about how this can be implemented.
The basic idea is to use a sequence of item ids, then process it much like a language model.
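As a rough sketch of the "sequence of item ids" idea (not the code from #119): the query side looks up each id in the history, embeds it, and runs a sequence encoder over the result. The item ids, the embedding size of 16, the GRU encoder, and the `embed_history` helper are all assumptions made for illustration.

```python
import tensorflow as tf

# Hypothetical item ids; in practice the vocabulary comes from the products dataset.
item_vocab = ["aaa//1", "bbb//2", "ccc//3", "ddd//4"]

# mask_token="" maps padding to index 0, so padded positions can be masked out.
lookup = tf.keras.layers.StringLookup(vocabulary=item_vocab, mask_token="")
embedding = tf.keras.layers.Embedding(lookup.vocabulary_size(), 16)
encoder = tf.keras.layers.GRU(16)  # assumption: any sequence encoder could be used here


def embed_history(histories):
    """Maps a [batch, length] tensor of item-id strings to [batch, 16] embeddings."""
    ids = lookup(histories)
    mask = tf.not_equal(ids, 0)  # True for real items, False for padding
    return encoder(embedding(ids), mask=mask)


# Two histories, right-padded with "" to the same length.
histories = tf.constant([
    ["aaa//1", "bbb//2", "ccc//3"],
    ["ddd//4", "", ""],
])
query_embeddings = embed_history(histories)
print(query_embeddings.shape)  # (2, 16)
```

An embedding like this could stand in for the user-id embedding in the query tower, so a user the model has never seen still gets a usable query from a handful of interactions.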
Hm, that's where I am at. I think I might have failed due to using strings: an item ID may look like 'xxxxxxx//xxxxx', so I would have something like ['xxxxxxxx//xxxx', 'xxxxxxxx//xxxxx', etc]. Going from a pandas.Series of such lists to a tensor is a bit awkward, so I ended up turning each list into a single string, 'xxxxxxxx//xxxxx xxxxxxxxx//xxxxx', with spaces separating the items, which I planned to tokenize afterwards with TextVectorization. I think this is where I'm losing it, as the example you sent me (I had looked into it) uses arrays. I might try something similar: see exactly how he handles his data and do likewise.
Thanks a lot for answering, I know I am just looking oblivious :p
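For the step of going from per-user lists of item ids to a tensor, one possible route is a ragged tensor that is then padded to a rectangle, which avoids joining the ids into one string. This is only a sketch: the ids below are made up, and with pandas the nested list would come from something like `series.tolist()`.

```python
import tensorflow as tf

# Hypothetical histories of different lengths, one list of item ids per user.
histories = [
    ["aaa//1", "bbb//2", "ccc//3"],
    ["ddd//4"],
]

# A ragged tensor holds variable-length rows; to_tensor pads them to equal length.
ragged = tf.ragged.constant(histories)
padded = ragged.to_tensor(default_value="")  # shape [2, 3], padded with ""

# The padded tensor can be sliced into a dataset like any other feature.
dataset = tf.data.Dataset.from_tensor_slices({"history": padded})
for example in dataset:
    print(example["history"].numpy())
```

Padding with the empty string pairs naturally with a StringLookup layer created with `mask_token=""`, since the padding then maps to index 0.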
Your questions make sense. Consider moving from pandas to something more suitable, like tf.data. Pandas is a good starting place for exploratory data analysis but isn't particularly well suited for tasks like this.
Thanks, I basically do that already, like this:
tf_ratings = tf.data.Dataset.from_tensor_slices(dict(ratings))
tf_products = tf.data.Dataset.from_tensor_slices(dict(products))
And I can easily proceed as guided by the tutorial.
Every tutorial I looked into gives recommendations for users already in the dataset. What if a new one appears? Let's say the model took data from users A to Z during training, and a new user AA has made some searches/ratings (so we can build a similar DataFrame with user_id, movie_title, user_rating, etc.).
If an unknown user is simply fed into the model then (obviously) the model won't be of much use. What I've been thinking is to take this additional information and, using whatever we have at the moment, find the most similar known user (typical CF) and base the initial query on that one.
Is there any way to implement this? I'd like to pass the mini-dataframe (with the new user's ratings) as an input of sorts, but I am not exactly certain how I should handle it afterwards. How can I add this additional information to my model?
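To make the "find the most similar user" idea concrete, here is a minimal sketch using cosine similarity over rating vectors. The users, the ratings matrix, and the `most_similar_user` helper are hypothetical, and treating unrated items as 0 is a simplifying assumption.

```python
import numpy as np

# Hypothetical ratings: rows are known users, columns are items, 0 means unrated.
known_users = ["A", "B", "C"]
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [1.0, 0.0, 5.0, 4.0],
    [4.0, 5.0, 1.0, 0.0],
])

# The new user's few ratings, laid out over the same item columns.
new_user = np.array([5.0, 4.0, 0.0, 0.0])


def most_similar_user(new_vector, matrix):
    """Returns the row index of the most cosine-similar user and all similarities."""
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(new_vector)
    sims = matrix @ new_vector / np.maximum(norms, 1e-12)  # guard against zero norms
    return int(np.argmax(sims)), sims


index, sims = most_similar_user(new_user, ratings)
print(known_users[index])  # A
```

The id of the matched user could then be passed to the trained model as the initial query until enough data exists to retrain with the new user included.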