microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

LGBMRegressor with linear_tree has constant leafs #5131

Closed Pablo-Davila closed 2 years ago

Pablo-Davila commented 2 years ago

Description

I am trying to build a gradient boosting tree with linear models in its leaves. I am using the parameter linear_tree for this purpose.

To check whether the leaves are actually linear, I built a sample dataset and a regression model with a single estimator using the code below. The results after running the script suggest that the leaves are constant values:

[Image: plot of the real values and the model's predictions, as produced by the script below]

Am I missing something?

Reproducible example

from lightgbm import LGBMRegressor
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import numpy as np

# Sample "dataset"
x = np.arange(0,30,0.1).reshape(-1,1)
y = np.sin(x).reshape(-1)

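# Sliding windows of 6 consecutive values: 5 lag features plus the target.
# Using "- 5" keeps the last window, so the test plot below lines up with x.
data = np.stack([
    y[i:i+6]
    for i in range(y.shape[0] - 5)
])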
x_train, x_test, y_train, y_test = train_test_split(
    data[:, :-1],
    data[:, -1],
    test_size=0.25,
    shuffle=False
)

# Model
params = {
    'objective': 'regression',
    'learning_rate': 1,
    'n_estimators': 1,
    'max_depth': 2,
    'linear_tree': True,
}
model = LGBMRegressor(**params)

model.fit(x_train, y_train)

# Plotting
plt.plot(x,y)
plt.plot(x[5:x_train.shape[0]+5].reshape(-1), model.predict(x_train))
plt.plot(x[-x_test.shape[0]:].reshape(-1), model.predict(x_test))
plt.legend(["real", "prediction (train)", "prediction (test)"])
plt.show()

Environment info

LightGBM version or commit hash: v3.3.2

jmoralez commented 2 years ago

Hi @Pablo-Davila, thank you for your interest in LightGBM. With the linear tree learner, the first iteration only sets constant leaf values, so a model with a single tree cannot have linear leaves: https://github.com/microsoft/LightGBM/blob/6d825cd3a1ac6f886654f604e824f7823a259cb5/src/treelearner/linear_tree_learner.cpp#L177-L182 Adjusting your example to have 'n_estimators': 2 gives the expected output.

[Image: plot of the predictions with 'n_estimators': 2]

Please let us know if this helps.

Pablo-Davila commented 2 years ago

Hi @jmoralez, thank you very much for your quick and accurate response. This definitely solves my issue.
