crflynn / skgrf

scikit-learn compatible Python bindings for grf (generalized random forests) C++ random forest library
https://skgrf.readthedocs.io/en/stable/
GNU General Public License v3.0

How should tuning be implemented? #18

Open crflynn opened 3 years ago

crflynn commented 3 years ago

GRF includes tuning facilities for many of the estimators. In particular, the following estimators have tuning parameter options:

In addition, some forests use tuning implicitly, and/or pass tuning parameters down into internal forests.

Scikit-learn also provides facilities for hyperparameter tuning in its model_selection module. This raises the question: when and where in skgrf should tuning be implemented, if at all?

  1. Make skgrf a true port of R-grf. This means implementing tuning exactly as it exists in the R lib, ignoring sklearn model selection, and hardcoding tuning in the same way.

  2. Ignore R-grf's tuning entirely, allowing users to utilize the model_selection module. This means, however, that the implementations for Causal, Instrumental, and Boosted forests would differ from what exists in R.

  3. Selectively implement R-grf's tuning, in order to maintain parity with R-grf's implicit tuning. This is the current implementation.

  4. Refactor some of the estimators to allow more fine-grained control over tuning of their separate components, removing tuning from skgrf and allowing users to tune with model_selection objects.
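
For reference, tuning through model_selection (options 2 and 4) would look roughly like the sketch below from the user's side. This is only an illustration under stated assumptions: scikit-learn's RandomForestRegressor stands in for a skgrf estimator, and the data and parameter grid are made up for the example. It is not how skgrf or R-grf tune internally.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data, purely illustrative.
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# Stand-in estimator (assumption): any scikit-learn compatible regressor,
# such as one exposed by skgrf, could take its place.
forest = RandomForestRegressor(n_estimators=20, random_state=0)

# Hypothetical grid over two forest hyperparameters.
param_grid = {
    "min_samples_leaf": [1, 5, 10],
    "max_features": [2, 5],
}

# Cross-validated grid search: fits one forest per parameter
# combination and fold, then refits on the best combination.
search = GridSearchCV(forest, param_grid, cv=3)
search.fit(X, y)

best_params = search.best_params_
print(best_params)
```

The trade-off is that this search cross-validates the estimator as a whole. Forests that tune internal sub-forests implicitly in R-grf would not get that behavior unless those components are exposed as separate parameters, which is what option 4 proposes.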