Closed: TalWac closed this issue 3 years ago
Thanks for using {lightgbm}!
Where does the function fit.lightgbm() come from? There is no such function in {lightgbm}. Right now, your example code does not seem to contain any {lightgbm} code.
[LightGBM] [Fatal] Cannot change max_bin after constructed Dataset handle.
This problem usually occurs when:
First, a Dataset is created and used for training once.
Then, if the user trains on the same Dataset a second time and specifies a different max_bin value from the first training, the error is reported.
This is because, before a Dataset is used for training, the feature values are discretized into at most max_bin bins. Currently, we don't support discretizing the same Dataset twice with different max_bin values.
To avoid this, recreate the Dataset each time before training.
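The advice above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not code from the issue: `make_dataset()`, `X`, and `y` are hypothetical names, and it assumes `max_bin` is supplied through the `params` argument of `lgb.Dataset()`.

```r
library(lightgbm)

# Synthetic regression data (an assumption for illustration only).
set.seed(1)
X <- matrix(rnorm(500 * 5), ncol = 5)
y <- X[, 1] + rnorm(500, sd = 0.1)

# Hypothetical helper: build a fresh Dataset for each max_bin value,
# so the binning is never changed on an already constructed Dataset.
make_dataset <- function(max_bin) {
  lgb.Dataset(X, label = y, params = list(max_bin = max_bin))
}

params <- list(objective = "regression", metric = "rmse")

# Each training run gets its own Dataset with its own max_bin.
model_50 <- lgb.train(params, make_dataset(50L), nrounds = 10L, verbose = -1L)
model_255 <- lgb.train(params, make_dataset(255L), nrounds = 10L, verbose = -1L)
```

Rebuilding the Dataset costs an extra binning pass over the data, which is usually acceptable when comparing only a few max_bin values.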
@jameslamb - Sorry for being unclear.
The fit.lightgbm() function comes from Repit (this is where all the code comes from).
This is how fit.lightgbm() looks:
> fit.lightgbm
function (training, testing)
{
    train <- as.matrix(training)
    test <- as.matrix(testing)
    coltrain <- ncol(train)
    coltest <- ncol(test)
    dtrain <- lightgbm::lgb.Dataset(train[, 2:coltrain], label = train[, 1])
    lightgbm::lgb.Dataset.construct(dtrain)
    dtest <- lightgbm::lgb.Dataset.create.valid(dtrain, test[, 2:coltest], label = test[, 1])
    valids <- list(test = dtest)
    params <- list(objective = "regression", metric = "rmse")
    modelcv <- lightgbm::lgb.cv(params, dtrain, nrounds = 5000,
        nfold = 10, valids, verbose = 1, early_stopping_rounds = 1000,
        record = TRUE, eval_freq = 1L, stratified = TRUE, max_depth = 4,
        max_leaf = 20, max_bin = 50)
    best.iter <- modelcv$best_iter
    params <- list(objective = "regression_l2", metric = "rmse")
    model <- lightgbm::lgb.train(params, dtrain, nrounds = best.iter,
        valids, verbose = 0, early_stopping_rounds = 1000, record = TRUE,
        eval_freq = 1L, max_depth = 4, max_leaf = 20, max_bin = 50)
    print(paste0("End training"))
    return(model)
}
If I change max_bin = 50 in the lgb.cv call (modelcv) to max_bin = 255, or remove it altogether, lightgbm::lgb.cv does work.
Otherwise, I get the error mentioned above.
@shiyu1994 - Thank you for your time
"Then if the user uses the same Dataset to train for the second time, and specify a different max_bin value from the first training, the error will be reported."
I do not think this is the case, since the fit.lightgbm function looks like this:
> fit.lightgbm
function (training, testing)
{
    train <- as.matrix(training)
    test <- as.matrix(testing)
    coltrain <- ncol(train)
    coltest <- ncol(test)
    dtrain <- lightgbm::lgb.Dataset(train[, 2:coltrain], label = train[, 1])
    lightgbm::lgb.Dataset.construct(dtrain)
    dtest <- lightgbm::lgb.Dataset.create.valid(dtrain, test[, 2:coltest], label = test[, 1])
    valids <- list(test = dtest)
    params <- list(objective = "regression", metric = "rmse")
    modelcv <- lightgbm::lgb.cv(params, dtrain, nrounds = 5000,
        nfold = 10, valids, verbose = 1, early_stopping_rounds = 1000,
        record = TRUE, eval_freq = 1L, stratified = TRUE, max_depth = 4,
        max_leaf = 20, max_bin = 50)
    best.iter <- modelcv$best_iter
    params <- list(objective = "regression_l2", metric = "rmse")
    model <- lightgbm::lgb.train(params, dtrain, nrounds = best.iter,
        valids, verbose = 0, early_stopping_rounds = 1000, record = TRUE,
        eval_freq = 1L, max_depth = 4, max_leaf = 20, max_bin = 50)
    print(paste0("End training"))
    return(model)
}
And max_bin = 50 is constant, if I understand it correctly.
I see, that was the context we needed. Thank you. I think that if you pass max_bin as a parameter to lgb.Dataset(), instead of lgb.train() and lgb.cv(), your code will run successfully. @shiyu1994's explanation above explains why. Let us know if you have additional questions.
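A rough sketch of that suggestion, for readers who land here with the same error. It is not the actual fix applied in Repit: `fit_lightgbm_fixed()` and the toy data are hypothetical, the round counts are shortened so the example runs quickly, and it assumes `max_bin` is passed through the `params` argument of `lgb.Dataset()`. In the original function the Dataset is constructed with the default max_bin (255) before lgb.cv() asks for 50, which is likely why max_bin = 255 ran without error.

```r
library(lightgbm)

# Hypothetical variant of fit.lightgbm(); column 1 is assumed to hold the label,
# as in the original function.
fit_lightgbm_fixed <- function(training, testing, nrounds = 50L) {
  train <- as.matrix(training)
  test <- as.matrix(testing)

  # max_bin controls how features are binned, so it is set on the Dataset
  # before construction rather than in lgb.cv() or lgb.train().
  dtrain <- lgb.Dataset(
    train[, -1, drop = FALSE],
    label = train[, 1],
    params = list(max_bin = 50L)
  )
  dtest <- lgb.Dataset.create.valid(
    dtrain,
    test[, -1, drop = FALSE],
    label = test[, 1]
  )

  # Training parameters only; max_bin is deliberately absent here.
  params <- list(
    objective = "regression",
    metric = "rmse",
    max_depth = 4L,
    num_leaves = 20L
  )

  # Cross-validate to pick the number of rounds, then train on the same Dataset.
  cv <- lgb.cv(params, dtrain, nrounds = nrounds, nfold = 5L,
               early_stopping_rounds = 10L, verbose = -1L)
  lgb.train(params, dtrain, nrounds = cv$best_iter,
            valids = list(test = dtest), verbose = -1L)
}

# Toy data (an assumption for illustration): label in column 1.
set.seed(42)
toy <- as.data.frame(matrix(rnorm(600 * 4), ncol = 4))
toy[[1]] <- toy[[2]] * 2 + rnorm(600, sd = 0.1)

model <- fit_lightgbm_fixed(toy[1:500, ], toy[501:600, ])
preds <- predict(model, as.matrix(toy[501:600, -1]))
```

Because both lgb.cv() and lgb.train() reuse the same dtrain, neither call should try to change the binning after the Dataset has been constructed.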
@jameslamb and @shiyu1994 Thank you for the clear explanations. Following @jameslamb's last comment now it works.
Many Thanks!!
Description
When trying to create a model I get this error:
Reproducible example
I run the script from Repit
When running this line, I receive the error above:
lightgbm <- fit.lightgbm(training,testing)
However, if I remove max_bin = 50 from the lightgbm::lgb.cv call (inside the function fit.lightgbm) or change it to max_bin = 255, then there is no error.
Environment info
LightGBM version or commit hash: version 3.1.1
Command(s) you used to install LightGBM
Additional Comments
Many Thanks!