[Question] Best practice for making the model updates with new data?

zhoudoufu commented 3 years ago

I have built a recommender by following the tutorial, but in practice the training data is not static: new content is generated, and users react to the content you recommend.

So I would like to get advice for two points:

- What's the best practice for updating the model with new user behavior or new content (without retraining from scratch)?
- What's the best practice for incorporating people's reactions to my recommendations?

Thanks in advance.

ageofneil commented 3 years ago

I'd be interested in this too. https://github.com/tensorflow/recommenders/issues/200 touches on this topic and references the GRU4Rec paper.

If we use an interaction sequence feature, would the model be able to output good recommendations when the input is an existing user id and a new interaction sequence (e.g. any new interactions since the model was trained)?

maciejkula commented 3 years ago

Incremental model training is the common practice. You keep the same model, and train it with new data every day, resuming from where the previous model left off.
For including new items or new users, the following approaches are common:
- Instead of using user ids, model users as functions of their past interactions. GRU4Rec falls in this category.
- Use a hashing approach to have effectively dynamic vocabularies that can adapt to new item ids/other categorical features appearing in the data as the model trains.
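
The first approach in the list can be illustrated with a minimal sketch. It is not GRU4Rec itself: a plain average of item embeddings stands in for the recurrent network, and the embeddings are random placeholders rather than trained values. All names (`make_item_embeddings`, `encode_user`, `recommend`) are hypothetical. The point is only that a user defined by their history needs no per-user parameters, so interactions that arrive after training can change the recommendations without retraining.

```python
import random

EMBEDDING_DIM = 4


def make_item_embeddings(item_ids, dim=EMBEDDING_DIM, seed=0):
    """Placeholder for trained item embeddings (random values, illustration only)."""
    rng = random.Random(seed)
    return {item: [rng.gauss(0.0, 1.0) for _ in range(dim)] for item in item_ids}


def encode_user(history, item_embeddings, dim=EMBEDDING_DIM):
    """Represent a user as the mean embedding of the items in their history.

    Assumption: mean pooling replaces the GRU used by GRU4Rec-style models.
    """
    known = [item_embeddings[i] for i in history if i in item_embeddings]
    if not known:
        return [0.0] * dim  # cold-start user: no usable history yet
    return [sum(column) / len(known) for column in zip(*known)]


def recommend(history, item_embeddings, k=2):
    """Score unseen items by dot product with the user vector; return the top k ids."""
    user_vector = encode_user(history, item_embeddings)
    seen = set(history)
    scores = {
        item: sum(u * v for u, v in zip(user_vector, vector))
        for item, vector in item_embeddings.items()
        if item not in seen
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


items = make_item_embeddings(["m1", "m2", "m3", "m4", "m5"])
before = recommend(["m1"], items)        # history known at training time
after = recommend(["m1", "m2"], items)   # same user, one new interaction
```

Because the user vector is recomputed from the history on every call, an existing user with a new interaction sequence is handled the same way as any other input.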

zhoudoufu commented 3 years ago

Thanks for your reply @maciejkula ~

> Incremental model training is the common practice. You keep the same model, and train it with new data every day, resuming from where the previous model left off.

In practice, how can I add some kind of decay on old data, so that the system tends to recommend newly generated items while still keeping old classics (as in the MovieLens dataset, where classic films still deserve to be recommended)? Since the TFRS project is quite new, it's hard to find tutorials on this; some code examples would be great.
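
One common way to express such a decay is to weight each training example by its age, with a floor so that old interactions never vanish completely. The sketch below is an assumption about how this could be done, not a TFRS recipe: `recency_weight`, `weighted_mean_loss`, `half_life_days`, and `floor` are hypothetical names and values. In a Keras-style training loop the resulting numbers would be supplied as a per-example `sample_weight`.

```python
def recency_weight(age_days, half_life_days=30.0, floor=0.1):
    """Exponential decay by age: the weight halves every `half_life_days`.

    The floor keeps very old interactions (e.g. classic films) from being
    ignored entirely. Both defaults are illustrative, not tuned values.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return max(floor, 0.5 ** (age_days / half_life_days))


def weighted_mean_loss(losses, weights):
    """Combine per-example losses so that recent examples count more."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(loss * w for loss, w in zip(losses, weights)) / total


ages = [0, 30, 365]                            # days since each interaction
weights = [recency_weight(a) for a in ages]    # [1.0, 0.5, 0.1]
```

A classic that people still watch keeps producing recent interactions, so it stays well represented on its own; the floor only protects items whose interactions are all old.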

> For including new items or new users, the following approaches are common:
>
> - Instead of using user ids, model users as functions of their past interactions. GRU4Rec falls in this category.
> - Use a hashing approach to have effectively dynamic vocabularies that can adapt to new item ids/other categorical features appearing in the data as the model trains.

I will try both of these suggestions, but I don't quite get the hashing trick. Your link seems to add a hash function to the categorical encoding result.

Say you have some new items that are not yet known to the trained model: the model will predict the hashed item id as an output. How do you then retrieve the real item id?

Or do you mean that the model still has to be trained with these unseen items, and applying the hash trick to the item id just helps keep the model's shape unchanged?
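
A possible reading of the hashing approach, sketched in plain Python: the hash only decides which row of a fixed-size embedding table an id uses, so the model's shape does not change when new ids appear. Under this reading the model does not output a hashed id; retrieval ranks candidates that are stored together with their real ids. Unseen items get whatever embedding their bucket already holds, so they still need training data before they are recommended well. `bucket_for`, `HashedEmbeddingTable`, `build_index`, and `top_k` are hypothetical names, and the embeddings are random placeholders; in Keras, `tf.keras.layers.Hashing` performs the id-to-bucket step.

```python
import hashlib
import random

NUM_BUCKETS = 1000
EMBEDDING_DIM = 4


def bucket_for(item_id, num_buckets=NUM_BUCKETS):
    """Map any string id to a stable bucket in [0, num_buckets)."""
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


class HashedEmbeddingTable:
    """Fixed-size table; every id, seen or unseen, maps to one of its rows."""

    def __init__(self, num_buckets=NUM_BUCKETS, dim=EMBEDDING_DIM, seed=0):
        rng = random.Random(seed)  # random rows stand in for trained weights
        self.rows = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                     for _ in range(num_buckets)]

    def lookup(self, item_id):
        return self.rows[bucket_for(item_id, len(self.rows))]


def build_index(item_ids, table):
    """Store each real id next to its embedding, so retrieval returns ids, not buckets."""
    return [(item_id, table.lookup(item_id)) for item_id in item_ids]


def top_k(query_vector, index, k=1):
    """Brute-force retrieval: rank candidates by dot product with the query."""
    def score(entry):
        return sum(q * v for q, v in zip(query_vector, entry[1]))
    ranked = sorted(index, key=score, reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


table = HashedEmbeddingTable()
catalog = ["movie_1", "movie_2", "movie_3"]
catalog.append("movie_new")            # appears after training; table shape is unchanged
index = build_index(catalog, table)    # rebuilt whenever the catalog changes
results = top_k([1.0, 0.0, 0.0, 0.0], index, k=2)
```

Two ids can collide in the same bucket and then share an embedding, which is the price paid for a vocabulary that never has to be resized.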

alimirferdos commented 3 years ago

@zhoudoufu Were you able to add a decay on the old data? Could you share your solution?

yunruili commented 2 years ago

Is there any code example showing how to:

- load a trained model, and
- retrain on top of this model?

My question is basically the same as https://github.com/tensorflow/recommenders/issues/384
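
The load-then-continue pattern can be sketched independently of any framework. The toy linear model, the JSON checkpoint file, and the names `train`, `save_state`, `load_state`, and `mean_squared_error` are all assumptions made for illustration. With a Keras model the same steps would usually go through `model.save_weights` / `model.load_weights` or `tf.train.Checkpoint`, and it is generally worth saving the optimizer state as well.

```python
import json
import os
import tempfile


def train(params, data, lr=0.05, epochs=50):
    """Plain SGD on squared error for y ~ w * x + b, updating params in place."""
    for _ in range(epochs):
        for x, y in data:
            error = params["w"] * x + params["b"] - y
            params["w"] -= lr * error * x
            params["b"] -= lr * error
            params["step"] += 1  # lets a later run see how far training got
    return params


def mean_squared_error(params, data):
    return sum((params["w"] * x + params["b"] - y) ** 2 for x, y in data) / len(data)


def save_state(params, path):
    with open(path, "w") as f:
        json.dump(params, f)


def load_state(path):
    with open(path) as f:
        return json.load(f)


checkpoint = os.path.join(tempfile.mkdtemp(), "model.json")

# Day 1: train from scratch and save the result.
day1 = [(x, 2.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
save_state(train({"w": 0.0, "b": 0.0, "step": 0}, day1), checkpoint)

# Day 2: behaviour has drifted; resume from the checkpoint instead of starting over.
day2 = [(x, 2.0 * x + 2.0) for x in (-2.0, -1.0, 1.0, 3.0)]
params = load_state(checkpoint)
loss_before = mean_squared_error(params, day2)
train(params, day2)
loss_after = mean_squared_error(params, day2)
save_state(params, checkpoint)  # starting point for day 3
```

The day-2 run starts from the day-1 parameters and only sees the new data, which is the incremental scheme described earlier in the thread.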

yunruili commented 2 years ago

> I will try both of these suggestion, but I don't quite get the hashing trick, Your link seems adding a hash function to the categorical encoding result,

Could you explain more about the hashing approach?

yunruili commented 2 years ago

https://arxiv.org/abs/2108.13299

tensorflow / recommenders

[Question] Best practice for making the model updates with new data? #239