stan-dev / projpred

Projection predictive variable selection
https://mc-stan.org/projpred/

Reduce peak memory usage during performance evaluation #450

Closed: fweber144 closed this 11 months ago

fweber144 commented 11 months ago

Closes #440 by implementing the suggestion from https://github.com/stan-dev/projpred/issues/440#issuecomment-1683373758. I can confirm that the reprex from https://github.com/stan-dev/projpred/issues/440#issue-1852823375 now runs through on my machine (the same machine I originally used for that reprex). It only errors at the very end, when the projection onto the full model takes place. That error is probably due to #323; the message is "Error in if (any(edgevals <- 0 < bdiff & bdiff < boundary.tol)) { : missing value where TRUE/FALSE needed". Running the reprex with nterms_max = 3 and wrapping the varsel() expression in peakRAM::peakRAM() (only twice, because each run takes long enough) confirms the reduction in peak memory usage:

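library(peakRAM)  # provides peakRAM()
# `vs_expr` holds the varsel() call of the reprex (with nterms_max = 3) as an
# unevaluated expression, so that eval() re-runs it inside peakRAM().

# Branch `master`:
peak_old <- replicate(2, peakRAM(eval(vs_expr)), simplify = FALSE)
# Quantiles of the peak RAM usage (in MiB) across the two runs:
quantile(sapply(peak_old, "[[", "Peak_RAM_Used_MiB"))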
#       0%      25%      50%      75%     100%
# 10659.80 10668.08 10676.35 10684.62 10692.90

# Branch `merge_getters` (this PR):
peak_new <- replicate(2, peakRAM(eval(vs_expr)), simplify = FALSE)
quantile(sapply(peak_new, "[[", "Peak_RAM_Used_MiB"))
#       0%      25%      50%      75%     100%
# 5297.800 5304.425 5311.050 5317.675 5324.300
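
To put the two quantile tables side by side, the following minimal sketch computes the relative reduction from the two medians reported above. The helper `rel_reduction()` and the variable names are hypothetical and only serve this illustration; they are not part of projpred or peakRAM. The sketch uses base R only and does not re-run the measurement.

```r
# Medians (50% quantiles) copied from the two outputs above, in MiB.
peak_old_median <- 10676.35  # branch `master`
peak_new_median <- 5311.050  # branch `merge_getters` (this PR)

# Hypothetical helper (not from projpred or peakRAM): relative reduction of
# `new` compared to `old`, returned as a fraction of `old`.
rel_reduction <- function(old, new) {
  stopifnot(is.numeric(old), is.numeric(new), all(old > 0))
  (old - new) / old
}

reduction <- rel_reduction(peak_old_median, peak_new_median)
cat(sprintf("Peak RAM reduced by about %.0f%%\n", 100 * reduction))
```

With only two runs per branch, this is a rough indication rather than a precise estimate, but it shows that peak memory usage was roughly halved in this reprex.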