microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

Can sample codes of the paper be provided in github? #3033

Closed Auzout closed 4 years ago

Auzout commented 4 years ago

Summary

I'm reading "LightGBM: A Highly Efficient Gradient Boosting Decision Tree" and I'm confused about GOSS. I tested it on the Allstate dataset as a regression task and found very poor accuracy once more than 10% of the samples were dropped. I want to reproduce the experiment from the paper, but I don't understand how to use the Allstate dataset for a classification task.
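For reference, this is roughly how GOSS is enabled through LightGBM's Python API; the parameter names (`boosting`, `top_rate`, `other_rate`) come from the library docs, where `top_rate` corresponds to the paper's a and `other_rate` to b. The exact objective and other tuning values here are illustrative assumptions, not the paper's full configuration:

```python
# Sketch of a GOSS parameter dict for lightgbm.train(); values assumed
# from the paper's "a = 0.05, b = 0.05 for Allstate" setting.
params = {
    "boosting": "goss",
    "objective": "binary",  # the paper uses Allstate as binary classification
    "top_rate": 0.05,       # a: fraction of large-gradient instances kept
    "other_rate": 0.05,     # b: fraction of small-gradient instances sampled
}
# Together these retain only a + b = 10% of the instances per iteration,
# which matches the "90% of data dropped" point raised above.
retained_fraction = params["top_rate"] + params["other_rate"]
```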

Motivation

The paper is an important window into the internals of the project; I hope both can be made more complete.

Description

References

From Table 1 (Datasets used in the experiments), we can see the Allstate dataset is used for binary classification.

Later the paper says: "we also tuned the parameters for all datasets towards a better balancing between speed and accuracy. We set a = 0.05, b = 0.05 for Allstate". With a = 0.05 and b = 0.05, 90% of the data is dropped; when testing on a regression task, this makes the loss several orders of magnitude larger.
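To make the a/b discussion concrete, here is a minimal, self-contained sketch of the GOSS sampling step as described in the paper (keep the top a·n instances by gradient magnitude, randomly sample b·n of the rest, and amplify the small-gradient sample's weights by (1-a)/b to keep the estimated gain approximately unbiased). This is an illustration of the paper's algorithm, not LightGBM's actual implementation:

```python
import random

def goss_sample(gradients, a=0.05, b=0.05, seed=0):
    """Select instance indices and weights per the paper's GOSS scheme.

    Returns a dict mapping kept index -> weight. Large-gradient
    instances keep weight 1.0; the sampled small-gradient instances
    are re-weighted by (1 - a) / b.
    """
    n = len(gradients)
    top_n = int(a * n)   # number of large-gradient instances to keep
    rand_n = int(b * n)  # number of small-gradient instances to sample
    # Sort indices by |gradient|, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_idx = order[:top_n]
    rng = random.Random(seed)
    rest_idx = rng.sample(order[top_n:], rand_n)
    weights = {i: 1.0 for i in top_idx}
    amplify = (1.0 - a) / b  # 19.0 when a = b = 0.05
    weights.update({i: amplify for i in rest_idx})
    return weights
```

With a = b = 0.05, only 10% of instances survive each iteration, which is why accuracy can degrade sharply if the re-weighting or the task setup (classification vs. regression) does not match the paper's experiments.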

github-actions[bot] commented 1 year ago

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.