Closed Edmondguo closed 3 years ago
Hi @Edmondguo . I can just point to some historic discussion and my understanding of how xgboost works, so it would still be good to get some official confirmation for that, e.g. from @hcho3 .
I believe rank:pairwise is a pairwise method that tries to minimize the number of pairwise ordering errors. rank:ndcg is a method following LambdaMART, and digging into the code confirms that rank:ndcg is an extension of rank:pairwise with additional weights added to the loss of each pair.
However, in a few experiments it looks as if rank:ndcg performs worse than rank:pairwise, and that might be due to the implementation. See e.g. https://github.com/dmlc/xgboost/issues/2092#issuecomment-286819394
Some time ago we verified rank:ndcg to perform a bit worse when evaluated on ndcg than rank:pairwise in our case.
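(Since the comparisons above are evaluated on NDCG, here is a minimal sketch of that metric in plain Python, assuming the common exponential-gain formulation 2^rel − 1 with a log2 position discount; xgboost's built-in ndcg eval metric may differ in details such as truncation handling.)

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: exponential gain 2^rel - 1,
    # discounted by log2 of the (1-based) position + 1.
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances))

def ndcg(relevances, k=None):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    ideal = sorted(relevances, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    return dcg(relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

# A ranking that places a highly relevant document too low scores below 1.0
print(ndcg([3, 2, 1]))  # ideal order
print(ndcg([3, 1, 2]))  # slightly mis-ordered tail
```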
rank:ndcg is an extension of rank:pairwise with additional weights added to the loss of each pair.
Exactly. In "From RankNet to LambdaRank to LambdaMART", LambdaMART optimizes NDCG by optimizing the pairwise loss (with lambda's) that is weighted with change in NDCG.
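That weighting can be made concrete: for each pair, the lambda is scaled by |ΔNDCG|, the change in NDCG that would result from swapping the two documents in the current ranking. A simplified illustration (not xgboost's actual implementation):

```python
import math

def delta_ndcg(relevances, i, j):
    """|Change in DCG| from swapping positions i and j, normalized by the
    ideal DCG. Only the two swapped positions change, so the delta has a
    closed form: (gain_i - gain_j) * (discount_i - discount_j)."""
    gain = lambda rel: 2 ** rel - 1
    disc = lambda pos: 1.0 / math.log2(pos + 2)
    ideal = sum(gain(r) * disc(p)
                for p, r in enumerate(sorted(relevances, reverse=True)))
    delta = (gain(relevances[i]) - gain(relevances[j])) * (disc(i) - disc(j))
    return abs(delta) / ideal

rels = [3, 0, 0, 0, 2]
# Swapping a highly relevant doc at the top with one far down moves NDCG a
# lot, so that pair's lambda gets a large weight ...
print(delta_ndcg(rels, 0, 4))
# ... while a swap between two low positions barely matters.
print(delta_ndcg(rels, 3, 4))
```

This is the extra factor rank:ndcg adds on top of the plain pairwise lambdas used by rank:pairwise.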
Thank you very much! In my experiments I also found that rank:ndcg performs worse than rank:pairwise.
Thank you! So does that mean that in rank:pairwise, xgboost uses the lambdas derived from the cross-entropy loss in RankNet as the loss function?
@Edmondguo Yes
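(For completeness, that RankNet-style lambda is just the gradient of the pairwise cross-entropy loss with respect to the score. A sketch, following the formulation in the Burges paper, where sigma controls the sigmoid steepness:)

```python
import math

def ranknet_lambda(s_i, s_j, sigma=1.0):
    """Gradient of the RankNet cross-entropy loss w.r.t. score s_i,
    for a pair where document i is labeled more relevant than j:
    lambda_ij = -sigma / (1 + exp(sigma * (s_i - s_j)))."""
    return -sigma / (1.0 + math.exp(sigma * (s_i - s_j)))

# The more the model mis-orders the pair (s_i << s_j), the larger the
# magnitude of the lambda pushing the scores apart.
print(ranknet_lambda(0.2, 2.0))  # badly mis-ranked pair
print(ranknet_lambda(2.0, 0.2))  # pair already ordered correctly
```

In rank:pairwise these lambdas are used as-is; in rank:ndcg each one is additionally weighted by the |ΔNDCG| of its pair.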
@Edmondguo @kretes Would you be interested in posting an example where you get a better NDCG metric by choosing rank:pairwise instead of rank:ndcg? I'd like to see whether this is a bug or just chance.
The project I am dealing with uses a ranking model for quantitative stock selection. It is hard to share an example because the data is too big. In this case rank:pairwise performs much better than rank:ndcg under the same booster parameters: the NDCG is 0.5138 for rank:ndcg and 0.5586 for rank:pairwise.
@Edmondguo Does your data have multiple relevance judgment levels (1, 2, 3, 4, ...) ?
Yes, before I train the model I convert y into levels (1, 2, 3, ..., 30).
It would be nice if there were a toy example we could use to show rank:pairwise outperforming rank:ndcg. Without an example, it is hard to find out why rank:ndcg is not working well.
Hello.
I believe I found an example where this is reproducible: rank:pairwise reaches an NDCG of 1 while rank:ndcg does not. See this gist: https://gist.github.com/kretes/1228e571aeba2a57f617352af633cd40.
I hope this helps in nailing down the issue.
I met this problem too and couldn't find a reason to explain it: objective = rank:pairwise performs better than objective = rank:ndcg.
@Edmondguo just want to follow up this issue. I met the same problem. Did you figure out the reason?
Some explanation is given in https://github.com/dmlc/xgboost/issues/6352 . For future work, see https://github.com/dmlc/xgboost/issues/6450 .
Thanks for adding ranking task support in xgboost! But I have a few questions: