Closed fweber144 closed 11 months ago
At least GAMMs are probably affected by this issue as well, possibly also GAMs.
Another idea, which might be less tedious and less error-prone than implementing a custom `predict()` method (or reducing the size of the lme4 fits): it might be possible to solve this issue by combining `get_sub_summaries()` and `get_submodls()` into a single function, so that the submodel fits from the increasingly complex submodels along the predictor ranking are not all stored at the same time. However, caution is needed with respect to `get_submodls()` in `project()` and in `loo_varsel()`'s `validate_search = FALSE` case.
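To make the "single function" idea more concrete, here is a minimal sketch in base R. It is not projpred code: `fit_submodel()`, `summarize_submodel()`, and `ranking` are hypothetical names, and `lm()` only stands in for the much larger lme4 submodel fits. The point is that fitting and summarizing within the same loop iteration lets each fit be garbage-collected before the next one is created, whereas the two-pass variant keeps all fits alive at once.

```r
# Toy data; lm() is only a stand-in for the (much larger) lme4 submodel fits.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)
ranking <- c("x1", "x2", "x3")  # hypothetical predictor ranking

# Hypothetical helper: fit the submodel with the first `k` ranked predictors.
fit_submodel <- function(k) {
  rhs <- if (k == 0) "1" else paste(ranking[seq_len(k)], collapse = " + ")
  lm(as.formula(paste("y ~", rhs)), data = dat)
}

# Hypothetical helper: keep only what is needed downstream (here: predictions).
summarize_submodel <- function(fit) {
  list(mu = unname(predict(fit, newdata = dat)))
}

sizes <- 0:length(ranking)

# Two-pass approach: all fits are held in memory at the same time.
all_fits <- lapply(sizes, fit_submodel)
summaries_two_pass <- lapply(all_fits, summarize_submodel)

# Single-pass approach: each fit is summarized and then dropped.
summaries_one_pass <- lapply(sizes, function(k) {
  fit <- fit_submodel(k)
  summarize_submodel(fit)
})

# Both approaches yield the same summaries; compare what has to be retained.
stopifnot(isTRUE(all.equal(summaries_two_pass, summaries_one_pass)))
print(object.size(all_fits), units = "Kb")
print(object.size(summaries_one_pass), units = "Kb")
```

This sketch only covers the case where the fits are not needed afterwards. Wherever the fits themselves are returned to the caller (as in `project()`), they would still have to be kept, which is where the caution mentioned above comes in.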
When performing a variable selection for a multilevel model with a large number of observations and a large number of projected draws, we can quickly run out of memory. The reason is probably that projpred stores the whole submodel fit (which, for a multilevel model, is an lme4 fit) for each projected draw. Each of these submodel fits might not be very large, but the collection of all submodel fits may require more memory than is available. A solution might be to reduce the size of the lme4 fits (I don't know if this is possible without breaking downstream code, in particular the `predict()` method for such fits) or to add a custom `predict()` method for lme4 fits that requires less information (not the whole fit object). The latter idea might in fact not be that tedious to implement, considering that we have `predict.subfit()` and `repair_re()`.

Illustration: On my machine (Linux) with 16 GB of RAM, the following reprex crashes R and eventually the whole machine (hence the CAUTION warning below and the wrapping in `if (FALSE)`):