Results from toying with the texas dataset:
`honesty` seems to be the key parameter that moves around the coefficient estimates in the texas results. When I compare estimates from tuning all parameters (`sample.fraction`, `mtry`, `min.node.size`, `honesty.fraction`, `honesty.prune.leaves`, `alpha`, `imbalance.penalty`) against estimates from tuning just the `honesty.fraction` and `honesty.prune.leaves` parameters, they are not much different. Using the `tune_regression_forest` function, I find that the best `honesty.fraction` is consistently 0.7.
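For reference, a minimal sketch of that comparison, where `X` and `Y` stand in for the texas covariate matrix and outcome (not shown here) and the argument names assume a recent grf (>= 1.0):

```r
library(grf)

# Tune every parameter grf exposes
rf_all <- regression_forest(
  X, Y,
  tune.parameters = c("sample.fraction", "mtry", "min.node.size",
                      "honesty.fraction", "honesty.prune.leaves",
                      "alpha", "imbalance.penalty")
)

# Tune only the honesty-related parameters
rf_honesty <- regression_forest(
  X, Y,
  tune.parameters = c("honesty.fraction", "honesty.prune.leaves")
)

# Compare the out-of-bag predictions from the two tuning strategies
summary(predict(rf_all)$predictions - predict(rf_honesty)$predictions)
```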
I also timed `dml` runs with just `dml_n = 5`. Keeping `honesty = TRUE` (or `honesty.fraction = 0.7`) cuts the runtime down a bit compared to `honesty = FALSE`. So in the texas case specifically, it seems we should set `honesty.fraction` to 0.7 and keep `tune.parameters` off to speed up the function.
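A rough timing sketch of that comparison - the `dml`/`dml_n` wrapper itself isn't shown, just the underlying grf fits under the three honesty settings (again with placeholder `X`, `Y`):

```r
library(grf)

# Fit the same forest under each honesty setting and record elapsed seconds
time_fit <- function(...) {
  system.time(regression_forest(X, Y, ...))[["elapsed"]]
}

c(
  honesty_off     = time_fit(honesty = FALSE),
  honesty_default = time_fit(honesty = TRUE),  # default honesty.fraction = 0.5
  honesty_0.7     = time_fit(honesty = TRUE, honesty.fraction = 0.7)
)
```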
For users in general, the best approach is probably: if you have a small dataset, figure out the best `honesty.fraction` by running `tune_regression_forest`, and perhaps keep `tune.parameters` turned off if speed is a concern.
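In other words, something like the following workflow (the `tunable.params` element name is my assumption about how grf stores the tuned values; `X`, `Y` are placeholders):

```r
library(grf)

# Tune the honesty parameters once on the small dataset
tuned <- regression_forest(
  X, Y,
  tune.parameters = c("honesty.fraction", "honesty.prune.leaves")
)
tuned$tunable.params  # inspect the tuned values (element name assumed)

# Then refit with the chosen honesty.fraction and tuning switched off for speed
final <- regression_forest(
  X, Y,
  honesty.fraction = 0.7,       # value the tuning step settled on for texas
  tune.parameters  = "none"
)
```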
Reading through what the `grf` people have to say about the `honesty` parameter in small datasets, the trade-off we're making is that `honesty` should lead to less biased estimates, but with small datasets, further splitting the data means there might not be enough information for the function to even determine what good splits are. But switching `honesty` on or off causes big swings in both the size and the point estimates of the coefficients - can the `tune.parameters` argument fix these swings and show us the "right" way to do things?
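One way to probe that would be to refit under the three settings and see how much the fits move before rerunning `dml` - a sketch, with `X`, `Y` as placeholders:

```r
library(grf)

# Refit under honesty off / honesty at 0.7 / fully tuned
fits <- list(
  honesty_off = regression_forest(X, Y, honesty = FALSE),
  honesty_0.7 = regression_forest(X, Y, honesty = TRUE, honesty.fraction = 0.7),
  tuned_all   = regression_forest(X, Y, tune.parameters = "all")
)

# Quick check: out-of-bag RMSE under each setting, before plugging each
# forest back into the dml step to see how the coefficients respond
sapply(fits, function(f) sqrt(mean((Y - predict(f)$predictions)^2)))
```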