Closed benkeser closed 2 years ago
Not particularly much thought went into the default num_knot choices. But, the main reason I chose to not make it grow with sample size is computation time, so that the package is friendlier for new users. If you have any feedback/ideas on this, that will be greatly appreciated.
No, that makes sense from a new user perspective and is probably the right approach. Perhaps some rough guidelines could be added to the documentation? Or at least a warning that tuning this value may be beneficial to performance.
The choice does mean "power users" (which may be a group that only includes me) have to be a bit more careful when studying theoretical simulations with HAL, to be sure that we're letting the number of bins get larger. I often use HAL as a catch-all algorithm when I need a consistent nuisance estimator in simulations to verify theoretical properties about estimators. In a simulation study with a student we were getting strange large-sample results and traced the issue (we think) back to this binning choice.
In any case, will close the issue -- just wanted to confirm that indeed the # of bins was staying fixed as sample size increased. Thanks for the response!
Thanks for all the work on this package.
If I am reading the documentation correctly, the new default values of
num_knots
uses a fixed number of knots for specifying a basis irrespective of sample size. Is that correct? If so, is there a motivation for this?