tensorflow / recommenders

TensorFlow Recommenders is a library for building recommender system models using TensorFlow.
Apache License 2.0

[Question] Best practice for making the model updates with new data? #239

Open zhoudoufu opened 3 years ago

zhoudoufu commented 3 years ago

I have built a recommender by following the method in the tutorial, but in real practice the training data is not static: new content is generated, and users react to the content you recommend.

So I would like to get advice for two points:

Thanks in advance.

ageofneil commented 3 years ago

I'd be interested in this too. https://github.com/tensorflow/recommenders/issues/200 touches on this topic and references the GRU4Rec paper.

If we use an interaction sequence feature, would the model be able to output good recommendations if the input is an existing user id and a new interaction sequence (e.g. any new interactions since the model was trained)?

maciejkula commented 3 years ago
  1. Incremental model training is the common practice. You keep the same model, and train it with new data every day, resuming from where the previous model left off.
  2. For including new items or new users, the following approaches are common:
    • Instead of using user ids, model users as functions of their past interactions. GRU4Rec falls in this category.
    • Use a hashing approach to have effectively dynamic vocabularies that can adapt to new item ids/other categorical features appearing in the data as the model trains.
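As a rough illustration of the hashing idea in the last bullet, here is a minimal, library-agnostic sketch. The names `NUM_BUCKETS`, `bucket_for` and `embedding_for` are hypothetical and chosen only for this example. In TensorFlow, the `tf.keras.layers.Hashing` layer provides a comparable id-to-bucket mapping, but the snippet below does not use it.

```python
import hashlib

# Assumed bucket count; it is fixed when the model is created and never grows.
NUM_BUCKETS = 1000
EMBEDDING_DIM = 4

def bucket_for(item_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map any item id string to a stable bucket in [0, num_buckets)."""
    digest = hashlib.md5(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

# One embedding row per bucket, not one row per item id, so the table shape
# does not change when new ids appear in the data.
embedding_table = [[0.0] * EMBEDDING_DIM for _ in range(NUM_BUCKETS)]

def embedding_for(item_id: str) -> list:
    """Look up the embedding row for an item id, seen before or not."""
    return embedding_table[bucket_for(item_id)]

# An id that did not exist at training time still resolves to a valid row.
new_item_embedding = embedding_for("never_seen_before_item")
```

The trade-off is collisions: with more ids than buckets, some ids necessarily share a row, so the bucket count is usually chosen well above the expected vocabulary size.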
zhoudoufu commented 3 years ago

Thanks for your reply @maciejkula ~

  1. Incremental model training is the common practice. You keep the same model, and train it with new data every day, resuming from where the previous model left off.

In real practice, how can I add some kind of decay on old data, so that the system tends to recommend newly generated items while still keeping old classic items (as in the MovieLens dataset, where classic films still deserve to be recommended)? As the TFRS project is quite new, it's hard to find tutorials on it, so some code examples would be great.
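One possible way to express such a decay (not something confirmed in this thread) is to give each training example a weight that shrinks with its age, with a floor so that old classics never vanish entirely. The function name `recency_weight`, the 30-day half-life and the 0.1 floor are assumptions for illustration. Keras `Model.fit` accepts per-example sample weights, and the TFRS task objects take a `sample_weight` argument, so weights like these could be passed through there.

```python
def recency_weight(age_days: float,
                   half_life_days: float = 30.0,
                   floor: float = 0.1) -> float:
    """Exponentially decay an interaction's weight with its age.

    The weight halves every `half_life_days` and never drops below `floor`,
    so very old interactions still contribute a little.
    """
    decayed = 0.5 ** (age_days / half_life_days)
    return max(floor, decayed)

# Hypothetical interaction ages, in days, turned into sample weights.
ages = [0.0, 30.0, 3650.0]
weights = [recency_weight(age) for age in ages]
```

A shorter half-life pushes the model harder towards recent items, and a higher floor protects long-lived items. Both values would need tuning against offline metrics.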

  2. For including new items or new users, the following approaches are common:

    • Instead of using user ids, model users as functions of their past interactions. GRU4Rec falls in this category.
    • Use a hashing approach to have effectively dynamic vocabularies that can adapt to new item ids/other categorical features appearing in the data as the model trains.

I will try both of these suggestions, but I don't quite get the hashing trick. Your link seems to add a hash function to the categorical encoding result.

Say you have some new items to recommend that are not yet known to the trained model: the model will predict the hashed item id as an output. How do you then retrieve the real item id?

Or do you mean that the model still has to be trained on these unseen items, and applying the hash trick to the item id just helps keep its shape unchanged?
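One reading of the hashing suggestion, sketched below under stated assumptions, is that the hash is only used on the input side to pick an embedding row. The retrieval step scores a list of candidates that is stored with the raw ids, so the raw id comes back directly and nothing has to be "un-hashed". All names here (`bucket_for`, `item_table`, `top_k`) and the toy embedding values are hypothetical.

```python
import hashlib

NUM_BUCKETS = 8  # tiny on purpose, for illustration only

def bucket_for(item_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Stable mapping from a raw item id to an embedding row index."""
    digest = hashlib.md5(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

# Toy item embeddings, one row per bucket. In a real model these are learned.
item_table = [[float(b), 1.0] for b in range(NUM_BUCKETS)]

def top_k(user_vector, candidate_ids, k=2):
    """Score raw candidate ids by dot product and return the best raw ids."""
    scored = []
    for item_id in candidate_ids:
        item_embedding = item_table[bucket_for(item_id)]
        score = sum(u * e for u, e in zip(user_vector, item_embedding))
        scored.append((score, item_id))  # the raw id travels with the score
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:k]]

candidates = ["movie_a", "movie_b", "brand_new_movie"]
recommended = top_k([1.0, 0.0], candidates, k=2)
```

Under this reading, a brand-new id lands in an existing row, which keeps the table shape fixed. Its embedding only becomes meaningful once incremental training has seen interactions for it.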

alimirferdos commented 3 years ago

@zhoudoufu Were you able to add a decay on the old data? Could you share your solution?

yunruili commented 2 years ago

Is there any code example showing how to

  1. load a trained model
  2. and retrain on top of this model?

My question is basically the same as https://github.com/tensorflow/recommenders/issues/384
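The load-then-retrain loop can be sketched without any framework, as below. This is a toy one-parameter model with a JSON checkpoint, and the helper names `train`, `save` and `load` are hypothetical. With a Keras model, the analogous steps would use `model.save_weights` and `model.load_weights` (or `tf.train.Checkpoint`, which can also track optimizer state) around each day's `fit` call.

```python
import json
import os
import tempfile

def train(weights, examples, lr=0.1):
    """One SGD pass on squared error for the toy model y ~ w * x."""
    w = weights["w"]
    for x, y in examples:
        w -= lr * 2 * (w * x - y) * x
    return {"w": w}

def save(weights, path):
    with open(path, "w") as f:
        json.dump(weights, f)

def load(path, default):
    """Resume from the checkpoint if one exists, else start fresh."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return default

checkpoint = os.path.join(tempfile.mkdtemp(), "model.json")

# Day 1: no checkpoint yet, so training starts from the default weights.
weights_day1 = load(checkpoint, {"w": 0.0})
weights_day1 = train(weights_day1, [(1.0, 2.0)] * 20)
save(weights_day1, checkpoint)

# Day 2: load yesterday's weights and continue on the new day's data.
weights_day2 = load(checkpoint, {"w": 0.0})
weights_day2 = train(weights_day2, [(1.0, 2.0)] * 20)
save(weights_day2, checkpoint)
```

The point is that day 2 starts from the saved state rather than from scratch, so each run only needs to process the new data.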
yunruili commented 2 years ago

I will try both of these suggestions, but I don't quite get the hashing trick. Your link seems to add a hash function to the categorical encoding result,

Could you explain more about the hashing approach?

yunruili commented 2 years ago

https://arxiv.org/abs/2108.13299