ogrisel / pygbm

Experimental Gradient Boosting Machines in Python with numba.
MIT License

Investigate the discrepancy in default hyperparams compared to LightGBM #32

Open ogrisel opened 5 years ago

ogrisel commented 5 years ago

Possible culprits:

See details in https://github.com/ogrisel/pygbm/issues/30#issuecomment-435091127.

ogrisel commented 5 years ago

As @NicolasHug noted, the handling of min_samples_leaf in pygbm is not correct. I would rather implement what LightGBM does, that is, reject splits that would result in one of the child nodes having fewer than min_samples_leaf samples.
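
A minimal sketch of the rule described above, not the actual pygbm or LightGBM code. It assumes a histogram-based splitter where candidate split `i` sends bins `0..i` to the left child and the remaining bins to the right child. The function name `find_best_split` and its arguments `bin_counts` and `split_gains` are hypothetical and only serve the illustration.

```python
import numpy as np


def find_best_split(bin_counts, split_gains, min_samples_leaf):
    """Return the index of the best admissible split, or None.

    Hypothetical helper: candidate split i puts bins 0..i in the left
    child and bins i+1.. in the right child, so there are
    len(bin_counts) - 1 candidates, one gain value per candidate.
    """
    bin_counts = np.asarray(bin_counts)
    split_gains = np.asarray(split_gains, dtype=float)

    # Number of samples that would land in each child for every candidate.
    n_left = np.cumsum(bin_counts)[:-1]
    n_right = bin_counts.sum() - n_left

    # Reject any split that leaves a child with too few samples.
    valid = (n_left >= min_samples_leaf) & (n_right >= min_samples_leaf)
    if not valid.any():
        return None  # no admissible split: the node stays a leaf

    # Pick the highest gain among the admissible candidates only.
    masked_gains = np.where(valid, split_gains, -np.inf)
    return int(np.argmax(masked_gains))
```

With `bin_counts=[1, 4, 5, 1]` and `split_gains=[10.0, 2.0, 9.0]`, the two highest-gain candidates each leave a child with a single sample, so with `min_samples_leaf=3` only the middle candidate remains admissible.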

NicolasHug commented 5 years ago

You mean sklearn?

LightGBM is doing something very weird with min_sample_leaf: it looks like it is ignored because of num_leaves (see https://github.com/ogrisel/pygbm/issues/30#issuecomment-435138526).

guolinke commented 5 years ago

@NicolasHug I think you used the wrong parameter name in that code.

ogrisel commented 5 years ago

Indeed. It's actually the pygbm handling of min_samples_leaf that is broken. See: #34.