SimonDedman opened this issue 6 years ago
See this: https://www.r-bloggers.com/error-handling-in-r/ — currently there's no way to continue when gbm.loop/auto/step crashes. Could use this to fail fast: try the quickest (highest) LR values down to lower LR values (dividing by 10 from 0.1?) until one runs, which constrains the possibility space. Would need to get a feel for how BF & LR (& TC?) combine in practice to produce the final CV score. If they're hierarchical (LR > BF > TC) then could optimise in order, setting the possibility space backwards: TC is defined by n.expvars, BF by bfcheck. Optimise CV on LR alone, then optimise BF, then TC.
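The fail-fast LR descent could be sketched like this (a sketch only, assuming `dismo::gbm.step` as the underlying fitter; `find_working_lr` and its arguments are hypothetical names, and note `gbm.step` can also return NULL rather than erroring, so both cases are checked):

```r
# Sketch: step the learning rate down a decade at a time until a model runs.
library(dismo)

find_working_lr <- function(data, gbm.x, gbm.y, lr.start = 0.1, lr.min = 1e-5) {
  lr <- lr.start
  while (lr >= lr.min) {
    fit <- tryCatch(
      gbm.step(data = data, gbm.x = gbm.x, gbm.y = gbm.y,
               family = "bernoulli", tree.complexity = 2,
               learning.rate = lr, bag.fraction = 0.5),
      error = function(e) NULL)              # crash -> NULL, keep going
    if (!is.null(fit)) return(list(lr = lr, model = fit))
    lr <- lr / 10                            # fail fast: drop a decade, retry
  }
  stop("No learning rate ran successfully down to lr.min")
}
```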
https://stat.ethz.ch/R-manual/R-devel/library/base/html/options.html — options(error) calls stop; could potentially save getwd() at the start of loop/auto/step, then have the options(error) handler restore it: setwd(initialwd) && stop.
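A minimal sketch of the working-directory restore idea; `on.exit()` is arguably cleaner than `options(error = ...)` because it also fires on normal exit (the wrapper name is illustrative):

```r
# Sketch: guarantee the working directory is restored if the run crashes.
gbm.auto.safe <- function(...) {
  initialwd <- getwd()
  on.exit(setwd(initialwd), add = TRUE)  # runs on error AND on normal return
  # ... body of the gbm.loop/auto/step wrapper goes here ...
}

# The options(error) variant from the note above would look like:
# old <- options(error = function() { setwd(initialwd); stop() })
# on.exit(options(old), add = TRUE)
```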
Could parallelise parameter combinations and run many gbm.auto calls in a foreach loop (see https://github.com/SimonDedman/gbm.auto/issues/21), then compare processing time and CV score. This also gives the option to check for the absence of report.csv: if a gbm.auto run fails, report.csv will be absent.
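A sketch of the parallel grid idea, assuming one output directory per combination and using the presence of report.csv as the success flag (the directory layout and the elided gbm.auto data arguments are assumptions):

```r
# Sketch: parallel parameter grid; failed runs identified by missing report.csv.
library(foreach)
library(doParallel)

grid <- expand.grid(lr = c(0.1, 0.01, 0.001), bf = c(0.5, 0.75), tc = c(2, 5))
registerDoParallel(cores = parallel::detectCores() - 1)

results <- foreach(i = seq_len(nrow(grid)), .combine = rbind) %dopar% {
  p <- grid[i, ]
  outdir <- file.path("runs", paste(p$lr, p$bf, p$tc, sep = "_"))
  t0 <- Sys.time()
  # samples / expvar / resvar assumed defined in the calling environment:
  try(gbm.auto::gbm.auto(samples = samples, expvar = expvar, resvar = resvar,
                         tc = p$tc, lr = p$lr, bf = p$bf),
      silent = TRUE)
  cbind(p,
        ran  = file.exists(file.path(outdir, "report.csv")),
        mins = as.numeric(difftime(Sys.time(), t0, units = "mins")))
}
```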
Probably there's a relationship between sample size (/ positive sample size) & variance, and the optimal BFs & LRs. Certainly gbm.bfcheck gives you a range of BFs. If I can find this relationship, that probably obviates much (all?) of the optimisation work.
Could do this using BRTs!! What's the influence & relationship shape of tc, lr & bf on the CV score, and how do they interact? 3D surface output potentially?
What's the relationship between gbm.bfcheck results and what will actually run?
Optimise section as a wrapper around gbm.step for bin, and separately for gaus. Once it's optimised and run, the values will already be saved in the report CSV; those can then be re-used as list(bin, gaus) params if re-running in future. For optimising, ideally use the largest LR, the smallest BF >= 0.5, and a TC related to the number of variables.
https://www.tidymodels.org/learn/work/tune-svm/ could do this within the tidymodels framework. Could conceptually rewrite the entirety of gbm.auto within that framework... See also https://dials.tidymodels.org/articles/Basics.html https://tune.tidymodels.org/articles/getting_started.html
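Following those tidymodels articles, tuning a boosted-tree model could look roughly like this (a sketch using the xgboost engine rather than gbm.auto itself; `presence` and `samples` are hypothetical response/data names):

```r
# Sketch: tune a boosted tree in tidymodels, per the linked articles.
library(tidymodels)

spec <- boost_tree(trees = tune(), tree_depth = tune(), learn_rate = tune()) |>
  set_engine("xgboost") |>
  set_mode("classification")

wf    <- workflow() |> add_model(spec) |> add_formula(presence ~ .)
folds <- vfold_cv(samples, v = 5)          # 'samples': your BRT input data
res   <- tune_grid(wf, resamples = folds, grid = 20)
select_best(res, metric = "roc_auc")       # analogous to best lr/tc/ntrees
```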
This is already solved in Python: https://scikit-optimize.github.io/stable/auto_examples/sklearn-gridsearchcv-replacement.html https://scikit-optimize.github.io/stable/ — see Hulbert.etal.2020.Exponential build seismic energy Cascadia.pdf:
> We rely on the XGBoost library for the gradient boosted trees' regression, shown in Fig. 2 of the paper (and for results presented below). The problem is posed in a regression setting. Model hyperparameters are set by five-fold cross-validation, using Bayesian optimization (skopt library).
Anything comparable in R? https://softwarerecs.stackexchange.com/questions/25728/scikit-learn-for-r Could do it via reticulate https://rstudio.github.io/reticulate/ ?
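The reticulate route might look like this (a sketch only; assumes a Python environment with scikit-optimize and xgboost installed, and `X`/`y` as your predictor matrix and response):

```r
# Sketch: Bayesian hyperparameter search via skopt, called from R.
library(reticulate)
skopt <- import("skopt")
xgb   <- import("xgboost")

opt <- skopt$BayesSearchCV(
  estimator = xgb$XGBRegressor(),
  search_spaces = list(
    learning_rate = skopt$space$Real(1e-3, 0.3, prior = "log-uniform"),
    max_depth     = skopt$space$Integer(1L, 8L)),
  n_iter = 25L,
  cv = 5L)                 # five-fold CV, as in the Hulbert et al. quote
# opt$fit(X, y); opt$best_params_
```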
https://www.rdocumentation.org/packages/mlr/versions/2.19.0/topics/tuneParamsMultiCrit possible final solution
see gbm.auto.extras folder, tryCatchTest.R
Done? See Erik Franklin's code.
Trial & error iterative approach to determine the optimal LR for a dataset? How? Stop at whole percentages? Try the optim function & see http://r.789695.n4.nabble.com/Optimization-in-R-similar-to-MS-Excel-Solver-td4646759.html Possibly have an option to start with this in the R function. Separate function? Maybe do it as a separate function then feed into this so the outputs are jkl? Or make one uber-function whose parts can all be used separately; the uber-function doesn't need the loops? For optim, use method "L-BFGS-B"; require(optimx).

See https://stats.stackexchange.com/questions/103495/how-to-find-optimal-values-for-the-tuning-parameters-in-boosting-trees/105653#105653:

> The caret package in R is tailor made for this. Its train function takes a grid of parameter values and evaluates the performance using various flavors of cross-validation or the bootstrap. The package author has written a book, Applied Predictive Modeling, which is highly recommended. 5 repeats of 10-fold cross-validation is used throughout the book. For choosing the tree depth, I would first go for subject matter knowledge about the problem, i.e. if you do not expect any interactions, restrict the depth to 1 or go for a flexible parametric model (which is much easier to understand and interpret). That being said, I often find myself tuning the tree depth as subject matter knowledge is often very limited. I think the gbm package tunes the number of trees for fixed values of the tree depth and shrinkage.

https://www.youtube.com/watch?v=7Jbb2ItbTC4 — see gbm.fixed in BRT_ALL.R: having processed the optimal BRT, might as well just use those details going forward rather than re-running the best one again.
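The caret approach from the quoted answer could be sketched as follows (a sketch, not gbm.auto's method; `presence` and `samples` are hypothetical response/data names, and 5 repeats of 10-fold CV matches the book's recommendation):

```r
# Sketch: grid search over gbm hyperparameters with caret::train.
library(caret)

tgrid <- expand.grid(interaction.depth = c(1, 3, 5),     # ~ tc
                     n.trees           = seq(500, 3000, by = 500),
                     shrinkage         = c(0.1, 0.01, 0.001),  # ~ lr
                     n.minobsinnode    = 10)
ctrl  <- trainControl(method = "repeatedcv", number = 10, repeats = 5)

fit <- train(presence ~ ., data = samples, method = "gbm",
             tuneGrid = tgrid, trControl = ctrl, verbose = FALSE)
fit$bestTune   # best shrinkage / depth / n.trees combination found
```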