predict_next_batch not considering other products in the same session

hassanmasood1 commented 5 years ago

I am trying to check whether the model's predictions take the other items in the same session into account.

I tried batch_size 5 with different session IDs [0,1,2,3,4] as well as with the same session ID [0,0,0,0,0], for ItemIds [A1, A2, A3, A4, B1] (shoe, shoe, shoe, shoe, shirt). In both scenarios predict_next_batch gives the same prediction for B1. I assumed that if the session ID is the same for all input products, the model would process them sequentially and the prediction would be affected by the combination of items in the session.

hassanmasood1 commented 5 years ago

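batch_size = 5
# Alternative run: a distinct session ID at each batch position
# session_ids = ['0', '1', '2', '3', '4']
# This run: the same session ID at every batch position
session_ids = ['0', '0', '0', '0', '0']
# First batch_size item IDs from the validation set
input_item_ids = valid.ItemId.values[0:batch_size]
predict_for_item_ids = None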

print('session_ids: {}'.format(session_ids))
print('input_item_ids: {}'.format(input_item_ids))
print('predict_for_item_ids: {}'.format(predict_for_item_ids))

preds = gru.predict_next_batch(session_ids, input_item_ids, predict_for_item_ids, batch_size)
preds.fillna(0, inplace=True)
print('Preds: {}'.format(preds))

session_ids: ['0', '0', '0', '0', '0']
input_item_ids: ['44790602' '32528001' '32528400' '32528400' '32528010']
predict_for_item_ids: None

Preds:
                     0             1             2             3             4
04133139  9.912187e-05  2.520136e-04  9.875565e-04  9.875565e-04  1.449896e-04
15994400  1.708450e-05  3.565932e-05  1.420142e-04  1.420142e-04  7.520352e-06
17418001  3.279215e-06  7.654255e-07  1.834371e-06  1.834371e-06  1.035390e-06
15994700  3.367049e-05  1.172445e-04  3.229178e-04  3.229178e-04  3.718351e-05

hidasib commented 4 years ago

All items in the same session are considered, but the last item has the most impact on the recommendations, especially on the top few positions. Earlier items still have some influence and can change the order (and content) of the recommendations. However, if there is a sharp change within the session, the model will probably ignore the previous items entirely. This happens either because the data supports it (e.g. sharp changes in sessions are usually permanent) or because this kind of change in focus never (or rarely) occurred in the training data, so ignoring the earlier items is the best the model can do at inference time.
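
One way to check how much earlier items matter is to compare scores for the last item with and without the preceding items fed into the hidden state. The sketch below is a toy illustration only: `ToySessionModel` is a hypothetical stand-in with random weights, not GRU4Rec code. It assumes each batch position keeps its own hidden state, which is reset whenever the session ID at that position changes between calls. Under that assumption, passing five items in a single call scores each of them from an empty state, while passing them one call at a time at the same position lets the state accumulate. Whether the real predict_next_batch behaves this way should be confirmed against its docstring; the identical columns 2 and 3 in the output above (both fed item 32528400) would at least be consistent with it.

```python
import numpy as np


class ToySessionModel:
    """Hypothetical stand-in for a session-based recurrent recommender.

    NOT GRU4Rec: random weights and a plain tanh recurrence, used only to
    show how per-position hidden state changes the scores.
    """

    def __init__(self, n_items, hidden=8, batch=5, seed=0):
        rng = np.random.default_rng(seed)
        self.emb = rng.normal(size=(n_items, hidden))        # item embeddings
        self.w_h = rng.normal(size=(hidden, hidden)) * 0.5   # recurrent weights
        self.w_out = rng.normal(size=(hidden, n_items))      # output weights
        self.state = np.zeros((batch, hidden))               # one state per batch position
        self.slot_session = [None] * batch                   # session held at each position

    def predict_next_batch(self, session_ids, item_idx):
        # Assumed behaviour: reset a position's state when its session ID changes.
        for slot, sid in enumerate(session_ids):
            if self.slot_session[slot] != sid:
                self.state[slot] = 0.0
                self.slot_session[slot] = sid
        # Each position is updated only from its own previous state.
        self.state = np.tanh(self.emb[item_idx] + self.state @ self.w_h)
        # Rows are items, columns are batch positions (as in the output above).
        return (self.state @ self.w_out).T


# Toy item indices mirroring the thread's input: positions 2 and 3 share an item.
items = np.array([0, 1, 2, 2, 3])

# Case 1: all five items in one call, same session ID at every position.
model_a = ToySessionModel(n_items=6, batch=5)
parallel = model_a.predict_next_batch(['0'] * 5, items)

# Case 2: the same items fed one call at a time at a single position.
model_b = ToySessionModel(n_items=6, batch=1)
for item in items:
    sequential = model_b.predict_next_batch(['0'], np.array([item]))

print('columns 2 and 3 identical in one call:',
      np.allclose(parallel[:, 2], parallel[:, 3]))
print('last item scored the same in both cases:',
      np.allclose(parallel[:, 4], sequential[:, 0]))
```

In this toy setup, the single-call scores for the last item depend only on that item, while the sequential scores also depend on the four items fed before it. Running the same comparison against a trained model would show how much influence the earlier items actually have.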
