bethatkinson / rpart

Recursive Partitioning and Regression Trees

Grow tree one split at a time #48

Open RoelVerbelen opened 1 year ago

RoelVerbelen commented 1 year ago

Is there any way to use rpart to grow and/or prune trees up to a certain number of splits?

As the example below illustrates, the cptable skips certain numbers of splits (here 1 and 4). As a result, I can't prune this tree to exactly 1 split: I get either no splits or 2 splits.

I tried varying the control settings (cp = 0, xval = 0, and no minsplit or minbucket restrictions), but couldn't find a way to make rpart list all subtrees or build a tree with a given number of splits.

library(rpart)
library(rpart.plot)

set.seed(123)

df <- data.frame(x = rnorm(100), y = rnorm(100))

tree <- rpart(formula = y ~ x, 
              data = df,
              control = rpart.control(
                maxdepth = 3,
                minsplit = 2,
                minbucket = 1,
                cp = 0,
                xval = 0
              ))

tree$cptable
#>            CP nsplit rel error
#> 1 0.074229442      0 1.0000000
#> 2 0.037867436      2 0.8515411
#> 3 0.015919779      3 0.8136737
#> 4 0.007656255      5 0.7818341
#> 5 0.000000000      6 0.7741779

rpart.plot(prune(tree, 0.074229443))


rpart.plot(prune(tree, 0.074229442))

Created on 2023-03-15 by the reprex package (v2.0.1)

Essentially the same question as here on Cross Validated or here on Stack Overflow.
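For reference, one possible workaround for the 1-split case is sketched below. It is not confirmed anywhere in this thread. It assumes that `snip.rpart()` from rpart, called with an explicit `toss` vector of node numbers, is acceptable in place of `prune()`. The names `one_split` and `n_splits` are illustrative only. Note that this trims the tree by node number rather than by cp, so it does not make the cptable list the missing subtrees.

```r
library(rpart)

set.seed(123)
df <- data.frame(x = rnorm(100), y = rnorm(100))

# Same fit as in the reprex above
tree <- rpart(y ~ x,
              data = df,
              control = rpart.control(maxdepth = 3, minsplit = 2,
                                      minbucket = 1, cp = 0, xval = 0))

# Nodes 2 and 3 are the children of the root (node 1). Snipping at them
# turns both into leaves, so only the root split remains.
one_split <- snip.rpart(tree, toss = c(2, 3))

# Count the internal (non-leaf) nodes, i.e. the splits that remain
n_splits <- sum(one_split$frame$var != "<leaf>")
print(n_splits)
```

The same idea could in principle reach other split counts by choosing which node numbers to pass in `toss`, but deciding the order in which splits should be kept is left to the caller here.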