How to use different Ranking algorithms

Dinesh-Mali commented 5 years ago

Hello, I am using the TREC dataset (from LETOR) with TF-Ranking. I have several questions; please help me understand the concepts:

1) How do I use different ranking algorithms in TF-Ranking? In the example notebook I get the final results, but I have no idea which ranking algorithm is being used.

2) At the end, it reports losses for particular metrics. How do I get the scores instead, i.e. the relevance between a query and each of the documents given for that query, so that I can judge which documents are relevant?

3) The notebook example refers to MSLR-Web30k and sets _NUM_FEATURES to 136. I have my own documents and queries and created only 9 features, so when I train on my features with _NUM_FEATURES = 9, I get an InvalidArgumentError. Can you please explain this?

Thanks!!

ramakumar1729 commented 5 years ago

Hi Dinesh,

1) We mainly support the groupwise ranking model, described here. In the demo code, this corresponds to a pointwise scoring function with a pairwise logistic loss. By changing the group_size parameter, you can control the number of documents that interact to generate a score. By changing the loss function, you can choose a pointwise, pairwise, or listwise loss.
2) To get just the scores, build a ranking estimator, as in this part of the demo code, and then use the estimator.predict function to obtain scores.
3) Can you share more details on the error? What command did you run, what does a snippet of the input data look like, and what is the error itself?
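To make the pointwise/pairwise/listwise distinction concrete, here is a minimal pure-Python sketch of one loss from each family, computed for a single query. It does not call TF-Ranking (where the built-in losses are selected by key, e.g. through tfr.losses.make_loss_fn and tfr.losses.RankingLossKey); the function names, labels, and scores below are made up for illustration.

```python
import math


def sigmoid_cross_entropy_loss(labels, scores):
    """Pointwise: each document is compared only with its own binary label."""
    total = 0.0
    for y, s in zip(labels, scores):
        # log(1 + exp(-s)) equals -log(sigmoid(s));
        # log(1 + exp(s)) equals -log(1 - sigmoid(s)).
        total += y * math.log1p(math.exp(-s)) + (1 - y) * math.log1p(math.exp(s))
    return total


def pairwise_logistic_loss(labels, scores):
    """Pairwise: penalize pairs where the more relevant document is not scored higher."""
    total = 0.0
    for y_i, s_i in zip(labels, scores):
        for y_j, s_j in zip(labels, scores):
            if y_i > y_j:
                total += math.log1p(math.exp(-(s_i - s_j)))
    return total


def softmax_loss(labels, scores):
    """Listwise: cross entropy between the labels and a softmax over the whole list."""
    max_s = max(scores)  # subtract the max for numerical stability
    log_z = max_s + math.log(sum(math.exp(s - max_s) for s in scores))
    return -sum(y * (s - log_z) for y, s in zip(labels, scores))


# One query with three documents; only the first one is relevant (hypothetical data).
labels = [1, 0, 0]
good_scores = [2.0, 0.5, -1.0]  # relevant document ranked first
bad_scores = [-1.0, 0.5, 2.0]   # relevant document ranked last

for name, loss_fn in [
    ("pointwise", sigmoid_cross_entropy_loss),
    ("pairwise", pairwise_logistic_loss),
    ("listwise", softmax_loss),
]:
    print(name, loss_fn(labels, good_scores), loss_fn(labels, bad_scores))
```

All three losses are lower when the relevant document is scored highest; they differ in whether a document is judged on its own, against another document, or against the whole list.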

Dinesh-Mali commented 5 years ago

Hi Rama, thanks for the explanation.

I solved the third problem. As you explained in your first point, the group_size parameter can be changed to enable multi-item scoring, so I set group_size to 2. Now I get an InvalidArgumentError:

InvalidArgumentError: Input to reshape is a tensor with 144 values, but the requested shape has 288
[[{{node groupwise_dnn_v2/accumulate_scores/Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](groupwise_dnn_v2/group_score/dense_2/BiasAdd, groupwise_dnn_v2/accumulate_scores/Reshape/shape)]]

My input_fn is as follows:

    def input_fn(path):
        train_dataset = tf.data.Dataset.from_generator(
            tfr.data.libsvm_generator(path, _NUM_FEATURES, _LIST_SIZE),
            output_types=(
                {str(k): tf.float32 for k in range(1, _NUM_FEATURES + 1)},
                tf.float32,
            ),
            output_shapes=(
                {str(k): tf.TensorShape([_LIST_SIZE, 1])
                 for k in range(1, _NUM_FEATURES + 1)},
                tf.TensorShape([_LIST_SIZE]),
            ),
        )
        np.random.seed(25)
        train_dataset = train_dataset.shuffle(1000).batch(_BATCH_SIZE)
        return train_dataset.make_one_shot_iterator().get_next()

Can you please explain the reason for the error?

xuanhuiwang commented 5 years ago

If you set group_size = 2, your group_score_fn is expected to return logits of shape [batch_size, 2]. I guess your current one returns [batch_size, 1].
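The arithmetic behind the error can be sketched in plain Python. This is only an illustration under one assumption: that the 144 values in the message are one logit for each of 144 groups in the batch, while the model expects group_size = 2 logits per group (288 in total). The helper names are hypothetical and are not part of TF-Ranking.

```python
def logits_count(num_groups, output_units):
    """Number of values a scoring function emits: one row per group."""
    return num_groups * output_units


def reshape_check(num_values, requested_shape):
    """Mimic the element-count check that a reshape performs."""
    requested = 1
    for dim in requested_shape:
        requested *= dim
    if num_values != requested:
        raise ValueError(
            "Input to reshape has %d values, but the requested shape has %d"
            % (num_values, requested)
        )
    return requested_shape


NUM_GROUPS = 144  # assumption: inferred from the error message, not from the code
GROUP_SIZE = 2

# Scoring function whose final layer still emits 1 logit per group.
try:
    reshape_check(logits_count(NUM_GROUPS, 1), [NUM_GROUPS, GROUP_SIZE])
    mismatch = None
except ValueError as err:
    mismatch = str(err)
print(mismatch)

# Scoring function whose final layer emits GROUP_SIZE logits per group.
ok_shape = reshape_check(logits_count(NUM_GROUPS, GROUP_SIZE), [NUM_GROUPS, GROUP_SIZE])
print(ok_shape)
```

If this reading is right, the fix is to make the last layer of group_score_fn output group_size values instead of 1; the node name in the error (group_score/dense_2/BiasAdd) suggests that layer is the final dense layer.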

tensorflow / ranking

How to use different Ranking algorithms #61