dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Bug in `objective = "reg:pseudohubererror"` and `xgb.plot.tree()` #10988

Open DrJerryTAO opened 6 days ago

DrJerryTAO commented 6 days ago

Hi @mattn, I wanted to use XGBoost for quantile regression but found that the pseudo-Huber loss does no better than a null model. Currently, objective = 'reg:pseudohubererror' predicts 0.5 for every case and learns nothing, no matter how the other parameters are specified.
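To make the reported behavior easier to reason about, here is a minimal base R sketch of the pseudo-Huber gradient and Hessian evaluated at a constant prediction of 0.5 on the mtcars labels. It assumes the standard pseudo-Huber form with slope delta = 1 and a starting prediction of 0.5; both are assumptions for illustration, not values confirmed in this thread. If they hold, the mean loss should land close to the 18.6185 quoted from the training log.

```r
# Pseudo-Huber loss, gradient and Hessian at a constant prediction (base R only).
# Assumptions (not confirmed in the thread): slope delta = 1, start value 0.5.
delta <- 1
pred  <- 0.5
y     <- mtcars$mpg

r     <- pred - y                 # residual: prediction minus label
scale <- 1 + (r / delta)^2
grad  <- r / sqrt(scale)          # first derivative of the loss w.r.t. pred
hess  <- 1 / scale^1.5            # second derivative of the loss w.r.t. pred

# Mean pseudo-Huber loss when every case is predicted as 0.5
mean_loss <- mean(delta^2 * (sqrt(scale) - 1))
print(mean_loss)

# Gradients saturate near -1, while Hessians shrink roughly like 1 / |r|^3
print(range(grad))
print(sum(hess))
```

With residuals between roughly 10 and 33, each Hessian is tiny, so their sum over all 32 rows stays far below 1. XGBoost documents `min_child_weight` (default 1) as the minimum sum of Hessians required in a child, so one plausible but unverified reading is that no node ever passes that threshold. If that is the cause, lowering `min_child_weight` or setting `base_score` near the mean of the label would be worth trying.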

Also, xgb.plot.tree() shows nothing. The Viewer panel is blank.
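While the blank Viewer is unresolved, the fitted trees can still be inspected as text. The sketch below is a workaround, not a fix for `xgb.plot.tree()`. It mirrors the Tweedie settings from the reproduction script in this report, but builds the data with base R and calls `xgb.train()` with an explicit `params` list, which is an assumption about what runs on the installed version.

```r
library(xgboost)

# Same data as in the report, built without tidyverse.
dtrain <- xgb.DMatrix(
  data = as.matrix(mtcars[, setdiff(names(mtcars), "mpg")]),
  label = mtcars$mpg)

# Settings mirror the Tweedie run from the reproduction script.
model <- xgb.train(
  params = list(objective = "reg:tweedie", max_depth = 3, eta = 1),
  data = dtrain, nrounds = 4)

# One row per node: tree index, split feature, threshold, and node statistics.
tree_table <- xgb.model.dt.tree(model = model)
print(head(tree_table))

# Plain-text dump of the same trees, including per-node statistics.
tree_dump <- xgb.dump(model, with_stats = TRUE)
cat(head(tree_dump, 10), sep = "\n")
```

If both outputs list sensible splits, the model itself is fine and the problem is more likely in the rendering step than in training.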

Further, objective = "reg:quantileerror" results in an error, although the online documentation at https://xgboost.readthedocs.io/en/latest/parameter.html mentions it. I am using the latest version of the R package, 1.7.8.1.

library(xgboost)
library(tidyverse)
data(mtcars)
Data <- mtcars %>%
  {xgb.DMatrix(
    data = (.) %>% select(-mpg) %>% as.matrix(), 
    label = (.) %>% pull(mpg))}
Model <- xgboost(
  data = Data, 
  objective = "reg:pseudohubererror", 
  max.depth = 3, eta = 1, nrounds = 100)
"As the log shows, the mean pseudo-Huber error is 18.618537 at every 
iteration and never changes"
Model <- xgboost(
  data = Data, 
  objective = "reg:pseudohubererror", eval_metric = "mae", 
  max.depth = 3, eta = 1, nrounds = 100)
"mae = 19.590625, no change over 100 iterations"
mean(abs(mtcars$mpg - 0.5)) # 19.59062
"objective = 'reg:pseudohubererror' predicts every case as 0.5, 
no information learnt at all."
Model <- xgboost(
  data = Data, 
  objective = "reg:quantileerror", eval_metric = "mae", 
  max.depth = 3, eta = 1, nrounds = 100)
"Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) : 
  [02:01:16] src/objective/objective.cc:26: 
  Unknown objective function: `reg:quantileerror`
Objective candidate: survival:aft
Objective candidate: binary:hinge
Objective candidate: rank:pairwise
Objective candidate: rank:ndcg
Objective candidate: rank:map
Objective candidate: multi:softmax
Objective candidate: multi:softprob
Objective candidate: reg:squarederror
Objective candidate: reg:squaredlogerror
Objective candidate: reg:logistic
Objective candidate: binary:logistic
Objective candidate: binary:logitraw
Objective candidate: reg:linear
Objective candidate: reg:pseudohubererror
Objective candidate: count:poisson
Objective candidate: survival:cox
Objective candidate: reg:gamma
Objective candidate: reg:tweedie
Objective candidate: reg:absoluteerror"
Model <- xgboost(
  data = Data, 
  objective = "reg:tweedie", eval_metric = "mae", 
  max.depth = 3, eta = 1, nrounds = 4)
xgb.plot.tree(model = Model)
"The Viewer panel is blank. This is not caused by errors in my environment."
xgb.plot.importance(importance_matrix = xgb.importance(model = Model))
"If I plot variable importance, I do see a plot in Plots."
hcho3 commented 6 days ago

The "reg:quantileerror" objective was added in XGBoost 2.0, which isn't available on CRAN. You should install the R package from the source to use the feature.

DrJerryTAO commented 6 days ago

@hcho3 thanks. Could you address the xgb.plot.tree() and objective = 'reg:pseudohubererror' bugs? How do we install from source? I did not see any sample code for R.
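For reference, a source install from within R might look like the sketch below. `install_xgboost_from_source()` is a hypothetical helper written for this example, not part of xgboost. It assumes git and a working C++ toolchain are on the PATH, and that the repository keeps the R package under the `R-package` directory; the official installation guide is the authority if either assumption fails.

```r
# Hypothetical helper (not part of xgboost): clone the repository together
# with its submodules, then install the R package from the checkout.
# Assumes git and a C++ toolchain are available on the PATH.
install_xgboost_from_source <- function(
    dest = file.path(tempdir(), "xgboost-src")) {
  status <- system2(
    "git",
    c("clone", "--recursive", "https://github.com/dmlc/xgboost",
      shQuote(dest)))
  if (status != 0) stop("git clone failed with status ", status)

  # repos = NULL with type = "source" installs from a local directory.
  install.packages(file.path(dest, "R-package"), repos = NULL, type = "source")
  invisible(dest)
}

# Not run automatically, since it needs network access and compiles C++ code:
# install_xgboost_from_source()
# packageVersion("xgboost")
```

After a successful install, `packageVersion("xgboost")` in a fresh R session should report a version of 2.0 or later, which is what `reg:quantileerror` requires according to the reply above.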