Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
XGBoost supports different ranking objectives based on LambdaMART, including `rank:pairwise`, `rank:ndcg` and `rank:map`. `rank:pairwise` is an unscaled version of RankNet's cost, meaning the `\delta` term in LambdaMART is simply set to a constant. The other two are extensions of LambdaMART with different target measures. To clarify the ranking model implementation and make room for other objectives, I propose the following changes:
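As a rough sketch of the distinction (my own simplified code, not XGBoost's actual implementation), the pairwise lambda for a pair where document i is more relevant than document j differs between the objectives only in the `\delta` factor:

```python
import math

def lambda_gradient(s_i, s_j, delta=1.0, sigma=1.0):
    """RankNet-style pairwise gradient for a pair where i is more relevant
    than j. LambdaMART scales it by |delta| of the target metric (e.g.
    |delta NDCG| from swapping i and j), while rank:pairwise keeps it
    constant, i.e. the unscaled RankNet cost."""
    return -sigma * abs(delta) / (1.0 + math.exp(sigma * (s_i - s_j)))

# Constant delta (rank:pairwise, i.e. the proposed lambdamart:constant):
g_const = lambda_gradient(0.2, 0.5, delta=1.0)
# Metric-weighted delta (rank:ndcg-style), using an illustrative |delta NDCG|:
g_ndcg = lambda_gradient(0.2, 0.5, delta=0.31)
```

With a metric-weighted `delta`, pairs whose swap barely changes the metric contribute a smaller gradient than under the constant version.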
Rename the objective functions
Use `lambdamart:<extension>` for the existing objectives. All three objectives follow the pairwise approach, as LambdaRank/LambdaMART is a pairwise LTR approach, so having `rank:pairwise` next to `rank:ndcg` is just confusing. Also, with recent advancements in LTR there are revised LambdaMART algorithms, like the one mentioned in https://github.com/dmlc/xgboost/issues/6143, so we should make room for future algorithm additions.
`rank:pairwise` -> `lambdamart:constant`
`rank:ndcg` -> `lambdamart:ndcg`
`rank:map` -> `lambdamart:map`
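For backward compatibility, a deprecation shim could translate the old names to the new ones. The helper below is just an illustration of the proposed mapping, not an actual XGBoost API:

```python
import warnings

# Proposed mapping from the current objective names to the new ones.
_RENAMED_OBJECTIVES = {
    "rank:pairwise": "lambdamart:constant",
    "rank:ndcg": "lambdamart:ndcg",
    "rank:map": "lambdamart:map",
}

def canonical_objective(name: str) -> str:
    """Return the proposed objective name, warning on a deprecated alias."""
    if name in _RENAMED_OBJECTIVES:
        new = _RENAMED_OBJECTIVES[name]
        warnings.warn(f"{name} is deprecated, use {new} instead", FutureWarning)
        return new
    return name
```

Keeping the old names as warning aliases for a few releases would avoid breaking existing training scripts.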
Clarify and test the input and output space.
The input and output spaces for the pairwise objectives should be feature values and relevance degrees. Internally, XGBoost constructs pairs of model scores to compute the lambda gradients. For `map`, the label should be binary, as MAP doesn't work with multi-degree relevance. There's an extension to MAP called `gap` that supports multi-degree relevance, but it's out of the scope of this issue. For `ndcg`, the label should be strictly integral, as the NDCG gain in XGBoost is calculated as `2^rel_i - 1`. An alternative is to use a user-supplied gain in place of this one.
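The exponential gain above can be sketched as follows (a plain reimplementation for illustration, not XGBoost's code). It also makes clear why non-integral labels are problematic: `2^rel - 1` only reproduces the usual graded gains (0, 1, 3, 7, ...) for integer relevance degrees.

```python
import math

def dcg(relevances, k=None):
    """DCG with the exponential gain 2^rel - 1 and log2 position discount."""
    if k is None:
        k = len(relevances)
    return sum(
        (2.0 ** rel - 1.0) / math.log2(pos + 2)
        for pos, rel in enumerate(relevances[:k])
    )

def ndcg(relevances, k=None):
    """NDCG: DCG normalized by the ideal (descending-sorted) DCG."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

A perfectly ordered group scores exactly 1.0, and any misordering scores strictly less.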
More features to existing objectives
We should add class balancing to `rank:ndcg` to combat imbalanced relevance degrees in the target. Also, the truncation level should be explicitly specifiable for `rank:ndcg`. Lastly, we may support the second NDCG gain type mentioned above.
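One simple form the class balancing could take is inverse-frequency weighting of relevance degrees within a query group, so that the typically dominant zero-relevance documents don't drown out the rare highly relevant ones. This is purely a sketch of the idea; the actual design is open:

```python
from collections import Counter

def balance_weights(labels):
    """Weight each document inversely to how often its relevance degree
    appears in the group, normalized so the weights sum to len(labels)."""
    counts = Counter(labels)
    return [len(labels) / (len(counts) * counts[y]) for y in labels]
```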
Change default evaluation metrics
The default evaluation metric for `rank:ndcg` should be changed to `ndcg` instead of the current `map`. With the truncation mentioned above, the default metric should be `ndcg@truncation`.
Change implementation details
Right now the implementation deviates from the standard algorithm described in the related literature. I want to revise the implementation to bring the code closer to the descriptions in well-known literature. I have a prototype of a revised `rank:ndcg` running on CPU, which has better accuracy than the existing one even when truncation is set to 1. Also, there are some optimizations we can perform, like caching the IDCG values for each query group.
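The IDCG caching works because the ideal ordering depends only on a group's labels, which are fixed across boosting rounds. A minimal sketch of the idea (my own code, not the prototype mentioned above):

```python
import math

def compute_idcg(labels, k=None):
    """Ideal DCG: DCG of the labels sorted in descending order,
    using the 2^rel - 1 gain."""
    ordered = sorted(labels, reverse=True)[:k]
    return sum(
        (2.0 ** rel - 1.0) / math.log2(pos + 2)
        for pos, rel in enumerate(ordered)
    )

class IDCGCache:
    """Compute each query group's IDCG once and reuse it every round."""

    def __init__(self):
        self._cache = {}

    def get(self, group_id, labels, k=None):
        if group_id not in self._cache:
            self._cache[group_id] = compute_idcg(labels, k)
        return self._cache[group_id]
```

Since labels never change during training, the cache never needs invalidation; it only needs rebuilding when the `DMatrix` itself changes.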