grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0

Problems with grf in R Notebooks #1374

Closed. MCKnaus closed this issue 7 months ago.

MCKnaus commented 7 months ago

Description of the bug

I have opened an issue, "R Notebooks very slow with newer RStudio versions", for the RStudio developers. It is specifically motivated by the regression_forest() and causal_forest() functions: running them within the R Notebook environment slows RStudio down tremendously. Curiously, it seems to be an interaction between recent RStudio versions, R Notebooks, and grf. There are no problems, for example, in 2022 RStudio versions, in the console, or with ranger functions. Maybe you have an idea what is specific about regression_forest() that causes these issues, and can either give the RStudio colleagues a hint or find a way to fix it within the package.

Steps to reproduce

Create a blank R Notebook and include the following code chunk:

library(grf)
n <- 5000
p <- 10
X <- matrix(rnorm(n * p), n, p)
Y <- X[, 1] * rnorm(n)
r.forest <- regression_forest(X, Y)

# Predict using the forest.
X.test <- matrix(0, 101, p)
X.test[, 1] <- seq(-2, 2, length.out = 101)
r.pred <- predict(r.forest, X.test)

Run the code chunk either by pushing the "play button" at the top right of the code chunk or by using "Run All". Importantly, running the same code in the console or knitting the notebook does not produce the problem.
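To put a number on the slowdown, one option is to time the forest fit and compare the elapsed seconds between a notebook chunk and the console. The following is only a rough sketch: `time_elapsed()` is a hypothetical helper written for this illustration (it is not part of grf), and it captures only the time spent inside the call, so any sluggishness of the RStudio interface after the chunk finishes would not show up in it.

```r
# Hypothetical helper (not part of grf): elapsed wall-clock seconds
# needed to evaluate an expression, measured with base R's system.time().
time_elapsed <- function(expr) {
  system.time(expr)[["elapsed"]]
}

# Only run the grf part if the package is installed.
if (requireNamespace("grf", quietly = TRUE)) {
  n <- 5000
  p <- 10
  X <- matrix(rnorm(n * p), n, p)
  Y <- X[, 1] * rnorm(n)

  # Time the same call as in the reproduction chunk.
  elapsed <- time_elapsed(grf::regression_forest(X, Y))
  cat("regression_forest() elapsed seconds:", elapsed, "\n")
}
```

Running this once as a notebook chunk and once in the console gives two numbers that can be attached to a bug report.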

GRF version: 2.3.1

erikcs commented 7 months ago

Thanks @MCKnaus, it seems the good folks over at RStudio figured this out.

MCKnaus commented 7 months ago

Indeed they did. They fixed it in this daily release, in case somebody runs into the same issue: https://dailies.rstudio.com/version/2023.12.0-daily+336/

Sorry for spamming your issues.
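For anyone unsure whether their RStudio already includes the fix, a version check along the following lines might help. This is a sketch under stated assumptions: it assumes that the daily build "2023.12.0-daily+336" corresponds to the version number 2023.12.0.336 as reported by `rstudioapi::versionInfo()`, which is not confirmed in this thread, and `has_notebook_fix()` is a hypothetical helper, not part of any package.

```r
# ASSUMPTION: the daily build "2023.12.0-daily+336" maps to version
# 2023.12.0.336. This mapping is a guess and is not confirmed in the thread.
fixed_in <- numeric_version("2023.12.0.336")

# Hypothetical helper: TRUE if `current` is at or past the assumed fixed build.
has_notebook_fix <- function(current, minimum = fixed_in) {
  numeric_version(as.character(current)) >= minimum
}

# Query the running RStudio only when rstudioapi is installed and
# the code is actually executed inside RStudio.
if (requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable()) {
  current <- rstudioapi::versionInfo()$version
  cat("RStudio version:", as.character(current), "\n")
  cat("At or past the assumed fixed build:", has_notebook_fix(current), "\n")
}
```

Outside RStudio the check is skipped, so the snippet can be sourced from a plain R session without error.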

erikcs commented 7 months ago

Thank you for helping with getting grf running properly in R Notebooks!