In the example, the model is evaluated on sampled negatives. On that sampled dataset I can reproduce results similar to those reported in the paper. However, when I evaluate on the whole test set with all items as candidates, HR@10 drops to around 3%. Any advice on that?
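
To make the comparison concrete, here is a minimal sketch of the two evaluation protocols in question. It is not the repository's actual evaluation code: the function names, the `num_negatives` parameter, and the layout of `all_scores` (one list of per-item scores per user) are all assumptions made for illustration. It also does not exclude a user's training items from the candidates, which a real evaluation may do.

```python
import random


def hit_at_k(scores, target, candidates, k=10):
    """Return 1 if `target` ranks in the top-k among `candidates`, else 0."""
    target_score = scores[target]
    # Count candidates that score strictly higher than the held-out item.
    higher = sum(1 for c in candidates if c != target and scores[c] > target_score)
    return 1 if higher < k else 0


def evaluate(all_scores, targets, num_items, num_negatives=None, k=10, seed=0):
    """Compute HR@k averaged over users (hypothetical helper).

    num_negatives=None ranks the target against every item;
    otherwise it ranks against that many randomly sampled negatives.
    """
    rng = random.Random(seed)
    hits = 0
    for scores, target in zip(all_scores, targets):
        if num_negatives is None:
            candidates = range(num_items)
        else:
            pool = [i for i in range(num_items) if i != target]
            candidates = rng.sample(pool, num_negatives)
        hits += hit_at_k(scores, target, candidates, k)
    return hits / len(targets)
```

Because the sampled negatives are a subset of all items, the number of candidates that outscore the target can only grow when moving to full ranking. For the same scores, full-ranking HR@10 is therefore never higher than sampled HR@10, so some drop is expected; how large it is depends on the catalog size and the number of negatives sampled.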