NVIDIA-Merlin / Transformers4Rec

Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
https://nvidia-merlin.github.io/Transformers4Rec/main
Apache License 2.0
1.07k stars 142 forks

Generating predictions #759

Open Oussamakhammassi opened 8 months ago

Oussamakhammassi commented 8 months ago

❓ Questions & Help

Details

Hello! I'm working on an e-commerce dataset to predict items. I successfully ran the model training from the tutorial, and now I want to generate some results. For example, I want to give the model an item and have it return the top_k suggested items from the dataset. How can I do that? I tried using nextitempredictiontask.forward(), but it didn't work for me. I would be glad if someone could guide me through this.
rnyak commented 8 months ago

@Oussamakhammassi are you trying to generate predictions offline or using Triton inference server?

This example shows how to generate top-k items from Triton: https://github.com/NVIDIA-Merlin/Transformers4Rec/blob/main/examples/end-to-end-session-based/02-End-to-end-session-based-with-Yoochoose-PyT.ipynb

Oussamakhammassi commented 8 months ago

Hi @rnyak, I generated the predictions, but now I have another problem. My aim is to give the model a single item_id and have it predict the top_k recommended products. For that, I used my own dataset, which contains the same features as the yoochoose-clicks dataset, and I applied the same preprocessing. The problem is that the model gives the same predictions for all items. I tested it on yoochoose and it worked, but it doesn't work on my own data. I would also like to know whether it is feasible to predict from a single item rather than a sequence of items, as I did in my project. If it is, what details should I take into consideration to deal with this issue?

rnyak commented 8 months ago

@Oussamakhammassi for the training and evaluation steps, you need to provide a sequence with at least 2 interactions (e.g. item ids). For prediction, however, you can provide a sequence with only one interaction. If you train your model with features other than item-id-list, you need to provide those features at the inference step as well. You cannot feed only item-id-list if the model was trained with multiple features.
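The point about features can be checked before calling the model. Below is a minimal sketch, assuming the inference request is a plain dict mapping column names to lists of values. The helper `missing_features` and the column name `category-list` are hypothetical and are not part of Transformers4Rec.

```python
def missing_features(trained_columns, request):
    """Return the trained columns that are absent from, or empty in, the request."""
    return [col for col in trained_columns if not request.get(col)]


# Hypothetical list of columns the model was trained with.
trained_columns = ["item-id-list", "category-list"]

# A single-interaction session: one value per feature is enough for prediction,
# but every feature used in training still has to be present.
request = {"item-id-list": [1234]}

print(missing_features(trained_columns, request))
```

If the printed list is not empty, the request lacks inputs the model was trained on and should be completed before inference.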

rnyak commented 8 months ago

You also need to check whether you are tagging your features properly. You said your model gives the same predictions for all items: is your data synthetic or a real dataset? If it is synthetically generated, as in some of our examples, it is normal to get the same predictions, since the data is not really meaningful and is only for demonstration purposes. What is not working with your data? Do you get an error? If so, what is the error?
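To make "the same predictions for all items" concrete, it may help to count how many distinct top-k lists the model actually returns. This is a rough sketch, assuming the top-k lists have already been collected into a dict keyed by the input item id; the helper name and the toy values are hypothetical.

```python
def distinct_topk_lists(predictions):
    """Count the distinct top-k lists among predictions keyed by input item."""
    return len({tuple(topk) for topk in predictions.values()})


# Toy values for illustration only: three different inputs, identical outputs.
predictions = {
    101: [5, 9, 2],
    102: [5, 9, 2],
    103: [5, 9, 2],
}

print(distinct_topk_lists(predictions))
```

A count of 1 across many different inputs means the model is ignoring its input, which points at the features or the data rather than at the top-k step.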

Oussamakhammassi commented 8 months ago

Hi @rnyak, I'm using a real dataset built from the purchase history of an online website, and I provided the same features used in yoochoose-clicks. Data preprocessing, training, and evaluation all work, but prediction generates the same predictions for all items. I compared yoochoose with my dataset, and the only difference I could notice is product popularity: in yoochoose-clicks a product appears in 200 transactions on average, while in my data it appears in 45. Could this be why the model is not detecting any logical relationship in the data?
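The popularity figure compared above can be computed as follows. This is a minimal sketch, assuming the interactions are available as non-empty (session_id, item_id) pairs; the function name and the toy data are hypothetical.

```python
from collections import Counter


def mean_sessions_per_item(interactions):
    """Average number of distinct sessions each item appears in.

    `interactions` is a non-empty iterable of (session_id, item_id) pairs.
    """
    # Deduplicate so an item repeated within one session counts once.
    unique_pairs = set(interactions)
    sessions_per_item = Counter(item for _, item in unique_pairs)
    return sum(sessions_per_item.values()) / len(sessions_per_item)


# Toy data for illustration only.
toy = [(1, "a"), (1, "b"), (2, "a"), (3, "a"), (3, "c"), (3, "a")]
print(mean_sessions_per_item(toy))
```

Running the same function on both datasets gives a like-for-like comparison, since the average depends on whether repeated items within a session are counted once or several times.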