dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Rank:pairwise instance weight not supported? #2460

Closed shengyu-guo closed 6 years ago

shengyu-guo commented 7 years ago

Hi,

I am trying to use LambdaMART for pairwise ranking.

According to https://github.com/dmlc/xgboost/blob/master/doc/input_format.md#instance-weight-file, I can supply instance weights in an xxx.weight file.

After setting weights for some instances, however, the model still looks the same as a model trained with all instances weighted equally.

I checked the code in rank_obj.cc and could not find any place where the instance weight is used to construct the pairs or set their weights.

So, could you tell me whether instance weights are supported for LambdaMART?

Thanks!

Attached is the code I checked:

// get lambda weight for the pairs
this->GetLambdaWeight(lst, &pairs);
// rescale each gradient and hessian so that the lst have constant weighted
float scale = 1.0f / param_.num_pairsample;
if (param_.fix_list_weight != 0.0f) {
  scale *= param_.fix_list_weight / (gptr[k + 1] - gptr[k]);
}
for (size_t i = 0; i < pairs.size(); ++i) {
  const ListEntry &pos = lst[pairs[i].pos_index];
  const ListEntry &neg = lst[pairs[i].neg_index];
  const bst_float w = pairs[i].weight * scale;
  const float eps = 1e-16f;
  bst_float p = common::Sigmoid(pos.pred - neg.pred);
  bst_float g = p - 1.0f;
  bst_float h = std::max(p * (1.0f - p), eps);
  // accumulate gradient and hessian in both pid, and nid
  gpair[pos.rindex].grad += g * w;
  gpair[pos.rindex].hess += 2.0f * w * h;
  gpair[neg.rindex].grad -= g * w;
  gpair[neg.rindex].hess += 2.0f * w * h;
}
...
class PairwiseRankObj: public LambdaRankObj{
 protected:
  void GetLambdaWeight(const std::vector<ListEntry> &sorted_list,
                       std::vector<LambdaPair> *io_pairs) override {}
};
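
For illustration, here is a minimal, self-contained sketch of how a user-supplied weight could enter the pair loop quoted above. It is not XGBoost's code and not necessarily what any later fix does. The types `Entry`, `Pair`, and `GradPair`, the function `AccumulatePairGradients`, and the `group_weight` parameter are all hypothetical stand-ins. The sketch assumes one weight per query group; a per-instance scheme would additionally need a rule for combining the two weights of a pair, which is left open here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the types in the excerpt; not XGBoost's definitions.
struct Entry { float pred; std::size_t rindex; };
struct Pair { std::size_t pos_index; std::size_t neg_index; float weight; };
struct GradPair { float grad = 0.0f; float hess = 0.0f; };

inline float Sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Accumulate pairwise logistic gradients for one query group.
// `group_weight` is the assumed extra factor; passing 1.0f gives the same
// arithmetic as the unweighted loop in the excerpt.
void AccumulatePairGradients(const std::vector<Entry>& lst,
                             const std::vector<Pair>& pairs,
                             float scale, float group_weight,
                             std::vector<GradPair>* gpair) {
  const float eps = 1e-16f;
  for (const Pair& pr : pairs) {
    assert(pr.pos_index < lst.size() && pr.neg_index < lst.size());
    const Entry& pos = lst[pr.pos_index];
    const Entry& neg = lst[pr.neg_index];
    // The only change from the excerpt: the weight is multiplied in here.
    const float w = pr.weight * scale * group_weight;
    const float p = Sigmoid(pos.pred - neg.pred);
    const float g = p - 1.0f;
    const float h = std::max(p * (1.0f - p), eps);
    // Accumulate gradient and Hessian on both members of the pair.
    (*gpair)[pos.rindex].grad += g * w;
    (*gpair)[pos.rindex].hess += 2.0f * w * h;
    (*gpair)[neg.rindex].grad -= g * w;
    (*gpair)[neg.rindex].hess += 2.0f * w * h;
  }
}
```

In this sketch, doubling `group_weight` doubles every gradient and Hessian contribution from that group, so the weight only matters relative to the weights of other groups.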

ngoyal2707 commented 6 years ago

Thanks @shengyu-guo for reporting this issue. PR https://github.com/dmlc/xgboost/pull/3379 should solve it; it is now merged to master.