It's true that n_estimators in the code corresponds to M in the paper, as you have written. It's also true that in our experiments we set M by picking the iteration with the best val_nll on a validation set; concretely, that iteration is assigned to the variable best_itr.
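For illustration only, here is a minimal sketch of that kind of selection rule. It assumes a probabilistic boosting model that yields one predictive distribution per stage through a staged_pred_dist method, with a logpdf on each distribution; both names are assumptions for the sketch and not necessarily the exact code from the experiments.

```python
import numpy as np

def select_best_itr(model, X_val, y_val):
    # Negative log-likelihood of the validation set at each boosting stage.
    # `staged_pred_dist` / `logpdf` are assumed, illustrative method names.
    val_nll = [-dist.logpdf(y_val).mean() for dist in model.staged_pred_dist(X_val)]
    # Stage with the lowest validation NLL (1-indexed, like a stage count).
    best_itr = int(np.argmin(val_nll)) + 1
    return best_itr
```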
Picking M this way was designed to match the methodology of prior works. However, I want to make it clear that this methodology is just one option among many ways you can choose the hyper-parameter M in real-world applications. One alternative is to fix M to a randomly chosen large value. Another alternative is, as you mentioned, K-fold cross validation.
So the answer is yes: n_estimators can be chosen via K-fold cross validation, and it does not need to follow the code in this experiment.
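As a concrete illustration of that option, here is a minimal sketch of a K-fold search over n_estimators using scikit-learn's GridSearchCV. The GradientBoostingRegressor, the toy dataset, the candidate grid, and the scoring metric are all stand-ins; swap in your own model and score.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, KFold

X, y = load_diabetes(return_X_y=True)  # toy data, purely for illustration

# Candidate values for M (n_estimators in the code).
param_grid = {"n_estimators": [100, 200, 500, 1000]}

search = GridSearchCV(
    GradientBoostingRegressor(learning_rate=0.01),  # stand-in estimator
    param_grid,
    scoring="neg_mean_squared_error",
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)
print("n_estimators (M) chosen by 5-fold CV:", search.best_params_["n_estimators"])
```

If the model is probabilistic, the mean-squared-error score would typically be replaced by a held-out negative log-likelihood, which matches the val_nll criterion used in the experiments.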
Feel free to re-open the issue if anything is left unclear.
Hello,
Can someone please explain the difference between the parameter n_estimators in the documentation and the number of boosting iterations (referred to in the paper as M) that you were trying to tune in the experiments here? M is defined in the paper as the number of boosting stages.
Basically, I am looking at the following line of code, and I assume the M in this case is the best_itr variable in the code above. If I am interested in applying K-fold cross validation, is best_itr a parameter that I can tune through K-fold cross validation, for example?