Closed budbuddy closed 3 weeks ago
Yeah, this is a bug. `self.n_items` during retraining is the total number of items, which is likely to be bigger than the number of items in the new training set, given that the new training set may contain new items.
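To illustrate, one way the count mismatch could be reconciled is to build the per-item counts over all items the model knows, zero-filling items that are absent from the new training data. This is only a hypothetical sketch of that idea (the function name and signature are made up, not the actual patch):

```python
from collections import Counter

def item_counts_over_all_items(train_item_ids, n_items):
    """Build one count per known item id, zero-filled for items
    that do not appear in the new training data (hypothetical
    sketch, not the library's actual code)."""
    counts = Counter(train_item_ids)
    return [counts.get(i, 0) for i in range(n_items)]

# The model knows 5 items; the retraining data only contains items 1, 2 and 4.
counts = item_counts_over_all_items([1, 2, 2, 4], n_items=5)
print(counts)       # -> [0, 1, 2, 0, 1]
print(len(counts))  # -> 5, so a length check against n_items would now pass
```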
I will release a new version to fix it. Do you have any other problems?
No, everything else I've tried pretty much works, although I haven't tested every model. I've been mainly using TwoTower and DIN, with some experimentation with DeepFM and YoutubeRanking.
I've checked with my model and data, and the issue has been fixed.
I tried retraining a TwoTower model on new data, and got this error:
As you can see, this error comes from an assertion in `model.fit`, specifically in the case where `softmax` is chosen as the loss. I've been racking my brains over this one, but can't find a fix. `len(item_counts)` is the number of distinct items in the new training set and `self.n_items` is the number of item embeddings in the model, so realistically these cannot be equal when retraining, unless you retrain on the entire dataset. I also thought about changing the loss, but I don't get very good metrics with the other losses.
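To make the mismatch concrete, here is a minimal standalone sketch of the kind of assertion described above (the function `check_softmax_counts` and its signature are hypothetical, not the library's actual code):

```python
from collections import Counter

def check_softmax_counts(train_item_ids, n_items):
    """Mimics an equality assertion like the one hit in model.fit:
    item_counts is built from the new training data only, while
    n_items reflects every item the model has embeddings for."""
    item_counts = Counter(train_item_ids)  # only items present in new data
    assert len(item_counts) == n_items, (
        f"{len(item_counts)} distinct items in training data, "
        f"but model has {n_items} item embeddings"
    )

# Retraining data covers only 3 of the model's 5 known items,
# so the assertion fails.
new_data = [1, 2, 2, 4]
try:
    check_softmax_counts(new_data, n_items=5)
    failed = False
except AssertionError:
    failed = True
print(failed)  # -> True
```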