Open andvikt opened 5 months ago
@andvikt Thank you for your suggestions! Since sequential recommendation models in RecBole are formulated to predict only the next item in the sequence, our precision@k metric may not perform well in this setting. We will discuss how to improve it in upcoming updates.
Describe the bug

I noticed that my model (SASRec) has a very good recall@10 but a poor precision@10: recall@10 is 70% and precision@10 is 7%. I could not believe that, because when I evaluated it by hand my model also performed very well on precision.
Then I tried to figure out exactly how those metrics are calculated, and as far as I can judge for now, there may be a bug.
I will try to explain. The first thing I noticed is here:
Here I can see that the shape of result is (batch_size, maxTopK). But with the sequential dataloader, each row of result will always contain at most one positive item, because positive_u is just a torch.arange of batch_size:

All in all, this leads to poor precision results, because precision is calculated from the sum of true positives, which will always be at most 1, whereas for a top-10 calculation it should be able to reach 10.
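To make the arithmetic concrete, here is a minimal sketch of the effect, not RecBole's actual implementation. The function name precision_recall_at_k and the toy hit matrix are assumptions made up for illustration. With exactly one positive per row, precision@k is just recall@k divided by k, which matches the 70% versus 7% figures reported above.

```python
import numpy as np


def precision_recall_at_k(hits, n_pos, k):
    """Hypothetical helper, not a RecBole function.

    hits:  (batch_size, max_topk) 0/1 matrix, 1 where a ranked item is a positive.
    n_pos: (batch_size,) number of ground-truth positives per row.
    """
    hits_at_k = hits[:, :k].sum(axis=1)
    precision = (hits_at_k / k).mean()
    recall = (hits_at_k / n_pos).mean()
    return precision, recall


# Toy batch: 10 users, one positive each (the sequential setting).
# For 7 of the 10 users the positive lands somewhere in the top 10.
hits = np.zeros((10, 10), dtype=int)
for row in range(7):
    hits[row, row] = 1
n_pos = np.ones(10)

precision, recall = precision_recall_at_k(hits, n_pos, k=10)
print(f"recall@10={recall:.2f}, precision@10={precision:.2f}")
```

Even if the positive were ranked first for every user, precision@10 in this setup could not exceed 0.1.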
I think for sequential recommendation it would be better to predict the top K items by appending K mask tokens to the end of the sequence and evaluating on the last K interactions. Currently, evaluation is done on one item at a time (mask one item, evaluate on 1 positive). When we evaluate on only one positive, precision@k for k > 1 can never reach 100%: if we predict 2 items but the user has only one positive, precision@2 will be at most 50%.
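The difference between the two evaluation protocols can be sketched roughly as follows. This is only an illustration under assumed names (precision_at_k_multi, the item ids, and the held-out sets are all hypothetical) and says nothing about how a model would produce the ranking or how RecBole would implement it.

```python
def precision_at_k_multi(topk_items, held_out, k):
    """Hypothetical helper: precision@k against a set of held-out items per user.

    topk_items: list of ranked item-id lists, one per user.
    held_out:   list of sets of ground-truth item ids, one per user.
    """
    scores = []
    for ranked, positives in zip(topk_items, held_out):
        hits = sum(1 for item in ranked[:k] if item in positives)
        scores.append(hits / k)
    return sum(scores) / len(scores)


ranked = [[5, 3, 9, 1]]  # model's ranking for one user

# Current protocol: only the single next item (3) counts as positive.
single = precision_at_k_multi(ranked, [{3}], k=2)

# Proposed protocol: the last K=2 interactions (5 and 3) are both positives.
multi = precision_at_k_multi(ranked, [{5, 3}], k=2)

print(single, multi)
```

With one held-out item the same ranking is capped at 0.5, while with two held-out items it can reach 1.0.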
Thank you for your work, and sorry if I misunderstood something in your code and made wrong assumptions.