microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License
16.52k stars 3.82k forks source link

Introduce Pairwise Ranking/Scoring in LambdaMART #6147

Open metpavel opened 10 months ago

metpavel commented 10 months ago

Summary

Introduce an alternative for the ranking task (powered by LambdaMART algorithm) to leverage pairwise scoring function (instead of the standard pointwise one).

Motivation

We expect to increase the performance of ranking models trained with LightGBM in most cases.

Description

LambdaMART (the learning-to-rank algorithm implemented in LightGBM) is often referred to as pairwise due to its loss function defined over pairs of documents from the ranked list. However, this pairwise loss function leverages pointwise scoring function s(x) defined over feature vectors of individual documents. Here we propose to extend LambdaMART to use pairwise scoring function s(x_i, x_j) defined on any pair of documents whose feature vectors are x_i and x_j. The advantage of using pairwise scoring function comes when we let it, additionally, use the differential (delta) features: x_i - x_j. That is, we learn a pairwise scoring function s(x_i, x_j, x_i - x_j). The use of delta features can often help ranking documents accurately by a much smaller model: imagine we have a document recency feature which is highly correlated with relevance. To rank documents effectively by this feature with a pointwise scoring function s(x) would require building a regression tree/ensemble with many splits corresponding to as many levels of recency as possible. On the other hand, when we use delta features (i.e., delta recency in this case), having only one split could be enough to predict relative relevance for a pair of documents, and rank them accurately. As a result, we have smaller and more accurate models with better generalization properties.

References

metpavel commented 10 months ago

FYI, @shiyu1994, @jameslamb.

jameslamb commented 10 months ago

exciting! I've added this to #2302 along with other features.

I have 2 questions:

  1. Can this be done in a backwards-compatible way? e.g. by adding a new supported value of objective or some other configuration parameter, instead of changing the default approach that's used by e.g. objective = "lambdarank". Is that even worth doing?
  2. Are you planning to contribute this?
metpavel commented 10 months ago

Hi @jameslamb,

  1. Sure! We'll keep default behavior as it is objective = "lambdarank". For the pairwise scoring, a user would be able to specify something like objective = "lambdarank_pw". Or, alternatively, instead of adding a new objective we could introduce a configuration parameter pairwise_score which is set to false by default.
  2. Yes, @shiyu1994 and I have already started working on it.
StrikerRUS commented 1 month ago

Linking #6182.