fabsig / KTBoost

A Python package which implements several boosting algorithms with different combinations of base learners, optimization algorithms, and loss functions.

sample_weight is being multiplied twice - Tobit Loss #6

Closed sanketrdeshmukh closed 4 years ago

sanketrdeshmukh commented 4 years ago

In the negative gradient of the Tobit loss function, the residual already accounts for the sample_weight of each observation. The sample weight is then accounted for again in the leaf update step, so sample weights may be double counted in the Tobit update step.

All the other loss functions, by contrast, account for the sample weight only in the leaf update step.
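
To illustrate the report, here is a minimal sketch of the double weighting (hypothetical function names; a plain residual stands in for the actual Tobit gradient):

```python
import numpy as np

def negative_gradient_weighted(y, pred, sample_weight):
    # Gradient already multiplied by sample_weight, as reported for the Tobit loss.
    residual = y - pred  # stand-in for the true Tobit gradient
    return sample_weight * residual

def leaf_update(residual, sample_weight):
    # Weighted leaf value: weighted mean of the residuals.
    return np.sum(sample_weight * residual) / np.sum(sample_weight)

y = np.array([1.0, 2.0, 3.0])
pred = np.zeros(3)
w = np.array([1.0, 2.0, 4.0])

res = negative_gradient_weighted(y, pred, w)
# The leaf update weights the already-weighted residuals again, so each
# observation effectively contributes with weight w_i ** 2:
print(leaf_update(res, w))                    # 57 / 7
print(np.sum(w**2 * (y - pred)) / np.sum(w))  # identical: weights enter squared
```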

fabsig commented 4 years ago

Many thanks for your comment, sanketrdeshmukh!

Could you please point to the exact locations in the code? That would make it easier for me to check this.

sanketrdeshmukh commented 4 years ago

Sample weights are used the first time at https://github.com/fabsig/KTBoost/blob/df79c8152f1b706d221d16526577f8acc4ca1e84/KTBoost/KTBoost.py#L748

and then a second time at https://github.com/fabsig/KTBoost/blob/df79c8152f1b706d221d16526577f8acc4ca1e84/KTBoost/KTBoost.py#L811

fabsig commented 4 years ago

Yes, you are right. I have corrected this.

Thanks for pointing this out!
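
For readers following the thread, a minimal sketch of the corrected flow, assuming the fix removes the extra multiplication from the negative gradient so that sample weights enter only once, in the leaf update (hypothetical names; not the actual KTBoost code):

```python
import numpy as np

def negative_gradient(y, pred):
    # Unweighted residual as a stand-in for the Tobit gradient; no sample_weight here.
    return y - pred

def leaf_update(residual, sample_weight):
    # sample_weight is applied exactly once, matching the other loss functions.
    return np.sum(sample_weight * residual) / np.sum(sample_weight)

y = np.array([1.0, 2.0, 3.0])
pred = np.zeros(3)
w = np.array([1.0, 2.0, 4.0])

print(leaf_update(negative_gradient(y, pred), w))  # each observation weighted by w_i once
```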