dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0
26.27k stars 8.72k forks source link

[RFC] Revising ranking objectives. #6450

Closed trivialfis closed 2 months ago

trivialfis commented 3 years ago

XGBoost supports different ranking objectives based on LambdaMART, including rank:pairwise, rank:ndcg and rank:map. The rank:pairwise is unscaled version of ranknet's cost, which means the \delta term in LambdaMART is just set to constant . Other 2 are extensions of LambdaMART with different measures. To clarify the ranking model implementation and make room for other objectives, I propose the following changes:

Rename the objective functions

Use lambdamark:extension for existing objectives. All 3 objectives are pairwise approach as lambdarank/lambdamart is pairwise LTR approach. The name rank:pairwise rank:ndcg is just confusing. Also with recent advancement in LTR, there are revised lambdamart algorithms like the one mentioned in https://github.com/dmlc/xgboost/issues/6143 , so we should make room for future algorithm additions.

Clarify and test the input and output space.

The input space and output space for pairwise objectives should be feature values and relevance degree. Internally xgboost constructs pairs of model scores to compute lambda gradients. For map, the label should be binary values as map doesn't work on multi-degree relevance. There's an extension to map called gap that supports multi-degree relevance, but it's out of the scope of this issue. For ndcg, the label should be strictly integers as the ndcg gain in xgboost is calculated as 2^rel_i -1. Another method is use user input gain in place of this one.

More features to existing objectives

We should add class balancing to rank:ndcg to combat imbalanced target relevance degree. Also, truncation should be explicitly specified to rank:ndcg. Lastly, we may support the second ndcg gain type mentioned above.

Change default evaluation metrics

The default evaluation metric for rank:ndcg should be changed to ndcg instead of the current map. With the truncation mentioned in above, the default metric should be ndcg@truncation.

Change implementation details

Right now the implementation deviates from standard algorithm described in related literature. I want to revise the implementation to get the code closer to descriptions in well known literature. I have a prototype of revised rank:ndcg running on CPU, which has better accuracy than existing one even when truncation is set to 1. Also, there are some optimizations we can perform like caching IDCG values for each query group.

trivialfis commented 3 years ago

@hcho3 @RAMitchell @CodingCat @ShvetsKS