dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

How to use xgboost to do lambdamart listwise ranking? #901

Closed gusuperstar closed 8 years ago

gusuperstar commented 8 years ago

"rank:pairwise": set XGBoost to do a ranking task by minimizing the pairwise loss

Do you mean this? Since LambdaMART is a listwise approach, how can I use it for listwise ranking? I am looking for the command, the parameters, and the training data format, and where I can set the lambda for LambdaMART. Could you give a brief demo or intro? Many thanks!

gusuperstar commented 8 years ago

OK, I see. XGBoost supports accomplishing ranking tasks. In a ranking scenario, data are often grouped, and we need the group information file to specify ranking tasks. The model used in XGBoost for ranking is LambdaRank; this function is not yet completed. Currently, we provide pairwise rank.

So listwise learning is not supported. Any plans?
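For readers unfamiliar with the "group information" mentioned above: as far as I understand it, XGBoost expects the training rows of each query to be contiguous, and the group information is the number of rows per query, in order. A minimal sketch of deriving those counts, assuming a hypothetical list of query ids with one entry per training row (the ids and the helper name `group_sizes` are made up for illustration):

```python
from itertools import groupby


def group_sizes(query_ids):
    """Count consecutive rows that share a query id.

    Assumes rows are already sorted so that each query's rows are contiguous.
    """
    return [len(list(rows)) for _, rows in groupby(query_ids)]


# Hypothetical query ids, one per training row.
qids = [1, 1, 1, 2, 2, 3, 3, 3, 3]
sizes = group_sizes(qids)
print(sizes)  # [3, 2, 4]
```

Writing these counts one per line should give the kind of group file the ranking demo uses, and the same list can be passed to `DMatrix.set_group` in Python.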

tqchen commented 8 years ago

Use rank:ndcg for LambdaRank with the NDCG metric.
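A rough sketch of what that suggestion could look like from Python, assuming `numpy` and `xgboost` are installed. The helper names `ranking_params` and `train_ranker`, and the `eta` and `max_depth` values, are placeholders of mine and not recommendations from this thread. Whether `rank:ndcg` is accepted may depend on the installed version.

```python
def ranking_params(objective="rank:ndcg"):
    """Build a parameter dict for a ranking task (values are illustrative only)."""
    return {
        "objective": objective,  # "rank:pairwise" is the other objective named in this thread
        "eval_metric": "ndcg",
        "eta": 0.1,
        "max_depth": 4,
    }


def train_ranker(features, labels, sizes, num_rounds=20):
    """Train a ranking model; `sizes` holds the number of rows per query, in order."""
    # Imported here so ranking_params() stays usable without xgboost installed.
    import numpy as np
    import xgboost as xgb

    dtrain = xgb.DMatrix(
        np.asarray(features, dtype=float),
        label=np.asarray(labels, dtype=float),
    )
    dtrain.set_group(sizes)  # rows of each query must be contiguous
    return xgb.train(ranking_params(), dtrain, num_boost_round=num_rounds)
```

For example, `train_ranker(X, y, [3, 2, 4])` would treat the first three rows of `X` as one query, the next two as a second, and the last four as a third.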

gdf0 commented 8 years ago

Hi, I just tried to use both objective = 'rank:map' and objective = 'rank:ndcg', but neither of them seems to work. The pairwise objective function is actually fine. I can see in the code that the LambdaMART objective function is still there, but I do not understand why it cannot be selected using the Python API. Thanks.

kapild commented 7 years ago

@tqchen can you comment on whether rank:ndcg or rank:map works for Python?

travisbrady commented 7 years ago

This needs clarification in the docs. Specifically:

  1. The FAQ says "Yes, xgboost implements LambdaMART. Checkout the objective section in parameters" yet the parameters page contains no mention of LambdaMART whatsoever.
  2. If LambdaMART does exist, there should be an example. I'm happy to submit a PR for this.
  3. This may just be an issue of mixed terminology, but if XGBoost wants to advertise LambdaMART in the FAQ, I'd recommend that the docs and code use that term as well.

Sandy4321 commented 6 years ago

Is it resolved?

vatsan commented 6 years ago

FWIW, "rank:ndcg" is defined here https://github.com/dmlc/xgboost/blob/72cd1517d6b1d145c34e13a063fadd31b507b01d/src/objective/rank_obj.cc#L331

The docs need to be updated.

hcho3 commented 6 years ago

@vatsan Looks like it was an oversight. Can you submit a pull request to update the parameter doc?

hcho3 commented 6 years ago

@vatsan @Sandy4321 @travisbrady I am adding all objectives to the parameter doc: #3672