cran / lightgbm

:exclamation: This is a read-only mirror of the CRAN R package repository. lightgbm — Light Gradient Boosting Machine. Homepage: https://github.com/Microsoft/LightGBM Report bugs for this package: https://github.com/Microsoft/LightGBM/issues

Inconsistent behavior of median regression (objective = "quantile", alpha = 0.5) with early stopping. #1

Open spitzem opened 2 years ago

spitzem commented 2 years ago

Hi there,

I'm experiencing unexpected behavior of LightGBM model trainings on a rather small dataset of about 4,100 rows. I'm using early stopping with a validation set of 310 rows, generated by sampling from the data, stratified with respect to the label y. I also played a little with the other parameters (learning_rate etc.), but that didn't really help.

The model training stops after iteration 1 and learns nothing for (objective = "quantile", alpha = 0.5), but produces meaningful results for nearby quantiles (alpha = 0.4999, alpha = 0.5001) and for objective = "regression_l1", as the code below shows.

Implications for me: I avoid using LightGBM with quantile regression in fully automated processes and fall back to the more robust "regression_l2" objective where the application context permits.
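Until the root cause is found, one defensive option (a sketch only, assuming the problem occurs exactly at alpha = 0.5; the helper name and epsilon are my own, not part of LightGBM) is to nudge the requested quantile slightly away from 0.5 before training:

```r
# Hypothetical guard: avoid requesting exactly alpha = 0.5.
# The epsilon of 1e-4 is an assumption, chosen to match the
# alpha = 0.4999 / 0.5001 runs below that did train normally.
safe_alpha <- function(alpha, eps = 1e-4) {
  if (abs(alpha - 0.5) < .Machine$double.eps) alpha + eps else alpha
}

safe_alpha(0.5)   # 0.5001
safe_alpha(0.25)  # unchanged: 0.25
```

This trades a tiny bias in the estimated quantile for a training run that does not stop after one iteration.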

Here is my code for a minimal working example. Unfortunately, I cannot share the data publicly, but I think I can provide it to the devs if that's helpful.

R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
lightgbm_3.3.1 (CRAN version)
rm(list = ls())
library(data.table)
library(ggplot2)
library(plotly)
library(dygraphs)

# Example data (not publicly shareable; available to the devs on request)
lgbmExampleNoMedianTrained <- readRDS("./exampleData/lgbmExampleNoMedianTrained.RDS")
attach(lgbmExampleNoMedianTrained)  # provides X, y, validInds

parameters <- list(learning_rate = 0.0372,
                   num_leaves = 14,
                   max_depth = 4, 
                   min_data_in_leaf = 0L,
                   feature_fraction = 1,
                   bagging_fraction = 1,
                   num_threads = 1,
                   metric = "quantile",
                   alpha = 0.5)  # median; the objective itself is passed via obj in lgb.train()

dtrain <- lightgbm::lgb.Dataset(
  data = X[-validInds, ],
  label = y[-validInds],
  weight = rep(1, length(y) - length(validInds)),
  colnames = colnames(X),
  categorical_feature = c("hourMinuteInt", "typeDayInt")
)

dvalid <- lightgbm::lgb.Dataset(
  data = X[validInds, ],
  label = y[validInds],
  weight = rep(1, length(validInds)),
  colnames = colnames(X),
  categorical_feature = c("hourMinuteInt", "typeDayInt")
)

trainedModelQuantileQ50 <- lightgbm::lgb.train(
  params = parameters,
  data = dtrain,
  nrounds = 1000L,
  early_stopping_rounds = 50L,
  valids = list("eval" = dvalid),
  obj = "quantile",
  verbose = 2
)

parameters$alpha <- 0.4999
trainedModelQuantileQ4999 <- lightgbm::lgb.train(
  params = parameters,
  data = dtrain,
  nrounds = 1000L,
  early_stopping_rounds = 50L,
  valids = list("eval" = dvalid),
  obj = "quantile",
  verbose = 2
)

parameters$alpha <- 0.5001
trainedModelQuantileQ5001 <- lightgbm::lgb.train(
  params = parameters,
  data = dtrain,
  nrounds = 1000L,
  early_stopping_rounds = 50L,
  valids = list("eval" = dvalid),
  obj = "quantile",
  verbose = 2
)

trainedModelRegressionL1 <- lightgbm::lgb.train(
  params = parameters,
  data = dtrain,
  nrounds = 1000L,
  early_stopping_rounds = 50L,
  valids = list("eval" = dvalid),
  obj = "regression_l1",
  verbose = 2
)

trainedModelQuantileQ50[["best_iter"]] # 1
trainedModelQuantileQ4999[["best_iter"]] # 364
trainedModelQuantileQ5001[["best_iter"]] # 295
trainedModelRegressionL1[["best_iter"]] # 260
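For fully automated processes, a cheap sanity check (again only a sketch; the helper name and threshold are my own assumptions) is to inspect best_iter after training and treat a run that stopped at iteration 1 as degenerate, falling back to a perturbed alpha or another objective:

```r
# Hypothetical guard for automated pipelines: a booster whose best
# iteration is below min_iters is treated as having learned nothing.
looks_degenerate <- function(model, min_iters = 2L) {
  model[["best_iter"]] < min_iters
}

# Example fallback logic (pseudocode in comments):
# if (looks_degenerate(trainedModelQuantileQ50)) {
#   retrain with alpha = 0.5001, or with obj = "regression_l1"
# }
```

This does not fix the underlying issue, but it prevents a silently useless model from reaching production.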
gaborcsardi commented 2 years ago

Hi, this is a read-only mirror of CRAN. Please contact the package authors listed in the DESCRIPTION file; look for Maintainer, BugReports, and URL. Thanks!