tensorflow / ranking

Learning to Rank in TensorFlow

Ranking products on listing page. E-commerce #187

NataliyaDi closed this issue 4 years ago

NataliyaDi commented 4 years ago

Hello,

I'd like to test tfr for ranking products on a listing page. My data consists of example features (price, number of sales over a past period, popularity, etc.) and some context features such as city, device and date. My target has 4 values: 0 - no action on the product during the session, 1 - click (switch to the product's detail impression page), 2 - add to cart, 5 - purchase. A binary target (click / no click) would also be fine. I sort my data by session id, then convert it to LIBSVM format. The number of products per session varies, but is always >= 12 (no more than 12 products can be on one page).

I split my data into train, validation and a held-out test set, then run the script with bazel as in the LIBSVM example. I set --num_features=189 --num_train_steps=10000 --train_batch_size=64 --list_size=36 --loss="neural_sort_cross_entropy_loss" --group_size=36. I tried different losses (list_mle_loss, approx_ndcg_loss, neural_sort_cross_entropy_loss) and different group_size values (2, 12, 36). I'm interested in the ndcg@k metric.

The script runs without problems, but I get terrible metrics: metric/ndcg@12 = 0.005347632, metric/ndcg@3 = 0.0032467463, metric/ndcg@6 = 0.0043196953, metric/ndcg@9 = 0.00504953, metric/ordered_pair_accuracy = 0.50526315.

Could you tell me what I'm doing wrong? Which parameters are set incorrectly?

xuanhuiwang commented 4 years ago

Do you have at least 1 positive (like click) for each listing? If there is no positive, the metric for that listing will always be 0. You can safely remove them from your training data.
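A minimal sketch of that filtering step, assuming the data is in the usual LIBSVM ranking layout where each line reads `<label> qid:<id> <feature>:<value> ...`. The function name `drop_lists_without_positives` and the sample rows are hypothetical and not part of TF-Ranking.

```python
from collections import OrderedDict


def drop_lists_without_positives(lines):
    """Keep only the lists (sessions) that contain at least one label > 0.

    Assumes each non-empty line looks like: "<label> qid:<id> <feat>:<value> ...".
    """
    groups = OrderedDict()  # qid -> raw lines, in first-seen order
    has_positive = {}       # qid -> True once any label in the list is > 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        label_str, qid_token = line.split()[:2]
        if not qid_token.startswith("qid:"):
            raise ValueError("expected 'qid:<id>' as second token: " + line)
        qid = qid_token[len("qid:"):]
        groups.setdefault(qid, []).append(line)
        has_positive[qid] = has_positive.get(qid, False) or float(label_str) > 0

    kept = []
    for qid, rows in groups.items():
        if has_positive[qid]:
            kept.extend(rows)
    return kept


# Hypothetical sample: session 1 has no click, add-to-cart or purchase.
sample = [
    "0 qid:1 1:0.5 2:10",
    "0 qid:1 1:0.7 2:3",
    "1 qid:2 1:0.2 2:8",
    "0 qid:2 1:0.9 2:1",
    "5 qid:3 1:0.4 2:6",
]
filtered = drop_lists_without_positives(sample)
```

Here session 1 is dropped because all of its labels are 0, while sessions 2 and 3 are kept unchanged. The same filter would normally be applied to the train, validation and test files separately.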

Also, please use group_size=1 to start.

NataliyaDi commented 4 years ago

Thanks, it helped. Did I understand correctly that Figure 1 of the article "Learning Groupwise Multivariate Scoring Functions Using Deep Neural Networks" uses group_size=2 and tracks the interactions between each pair of items?

So if I use group_size=1 in my task, I lose the effect of interactions between products, don't I?

xuanhuiwang commented 4 years ago

You are right. But it is not always helpful to use group_size > 1. To start, we usually use group_size = 1.
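To make the difference between group_size = 1 and group_size > 1 concrete, here is a toy sketch in plain Python. It is only an illustration of the groupwise idea (score every ordered group, then average the scores each item received), not TF-Ranking's actual implementation, which may form groups differently. The helper names and the price-based scoring functions are hypothetical.

```python
from itertools import permutations


def univariate_scores(items, score_one):
    """group_size = 1: each item is scored from its own features only."""
    return [score_one(x) for x in items]


def groupwise_scores(items, score_group, group_size):
    """group_size > 1: score every ordered group, then average per item.

    score_group takes a tuple of `group_size` items and returns one score
    per position, so an item's score can depend on the other group members.
    Assumes len(items) >= group_size.
    """
    totals = [0.0] * len(items)
    counts = [0] * len(items)
    for idx in permutations(range(len(items)), group_size):
        group_out = score_group(tuple(items[i] for i in idx))
        for i, s in zip(idx, group_out):
            totals[i] += s
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


# Toy data: each "product" is just its price.
prices = [10.0, 20.0, 30.0]


def score_one(price):
    # Cheaper is better, judged in isolation.
    return -price


def score_pair(pair):
    # Each product is scored by how much cheaper it is than its partner.
    a, b = pair
    return (b - a, a - b)


solo = univariate_scores(prices, score_one)
paired = groupwise_scores(prices, score_pair, group_size=2)
```

With group_size = 1 the score of a product never changes when the other products on the page change. With group_size = 2 it does, which is the interaction effect discussed above, at the cost of scoring many more groups per list.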