tidymodels / parsnip

A tidy unified interface to models
https://parsnip.tidymodels.org

coercing rpart model fit with parsnip to partykit fails #274

Status: Closed (juliasilge closed this issue 4 years ago)

juliasilge commented 4 years ago

A model fit directly using rpart can be coerced to partykit, but the same model fit by parsnip cannot be coerced, even though they have the same class.

library(parsnip)
library(rpart)
library(partykit)
#> Loading required package: grid
#> Loading required package: libcoin
#> Loading required package: mvtnorm

tree_spec <- 
  decision_tree(
    cost_complexity = 1e-10,
    tree_depth = 4
  ) %>% 
  set_engine("rpart") %>% 
  set_mode("classification")

rpart_fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
parsnip_fit <- tree_spec %>%
  fit(Kyphosis ~ Age + Number + Start, data = kyphosis)

class(rpart_fit)
#> [1] "rpart"
class(parsnip_fit$fit)
#> [1] "rpart"

as.party(rpart_fit)
#> 
#> Model formula:
#> Kyphosis ~ Age + Number + Start
#> 
#> Fitted party:
#> [1] root
#> |   [2] Start >= 8.5
#> |   |   [3] Start >= 14.5: absent (n = 29, err = 0.0%)
#> |   |   [4] Start < 14.5
#> |   |   |   [5] Age < 55: absent (n = 12, err = 0.0%)
#> |   |   |   [6] Age >= 55
#> |   |   |   |   [7] Age >= 111: absent (n = 14, err = 14.3%)
#> |   |   |   |   [8] Age < 111: present (n = 7, err = 42.9%)
#> |   [9] Start < 8.5: present (n = 19, err = 42.1%)
#> 
#> Number of inner nodes:    4
#> Number of terminal nodes: 5
as.party(parsnip_fit$fit)
#> Error in eval(predvars, data, env): invalid 'envir' argument of type 'closure'

Created on 2020-03-26 by the reprex package (v0.3.0)

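This is related to how partykit uses the formula and call stored on the object it is coercing: the error suggests that something in the call recorded by parsnip resolves to a function (a closure) rather than to the training data when it is evaluated again outside of parsnip. Can we repair the call?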
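To see what partykit has to work with, it can help to look at the call stored on the underlying rpart object. The following is only a diagnostic sketch, not a description of parsnip internals: it assumes the stored call is reachable as `parsnip_fit$fit$call` (the usual `call` element of an rpart object) and simply prints it together with its `data` argument, without asserting what they contain.

```r
library(parsnip)
library(rpart)

tree_spec <-
  decision_tree(cost_complexity = 1e-10, tree_depth = 4) %>%
  set_engine("rpart") %>%
  set_mode("classification")

parsnip_fit <- tree_spec %>%
  fit(Kyphosis ~ Age + Number + Start, data = kyphosis)

# The call that parsnip recorded on the rpart object.
stored_call <- parsnip_fit$fit$call
print(stored_call)

# The `data` argument of that call. If this is a placeholder name rather
# than `kyphosis`, re-evaluating the call outside parsnip will not find
# the training data.
print(stored_call$data)

# For comparison, the call recorded by a direct rpart() fit.
rpart_fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
print(rpart_fit$call)
```

If the two calls differ in their `data` argument, that difference is a likely explanation for why the objects behave differently despite sharing the class "rpart".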

juliasilge commented 4 years ago

This is now resolved by #316. Passing the parsnip fit through `repair_call()` with the training data before coercing works:

library(parsnip)
library(rpart)
library(partykit)
#> Loading required package: grid
#> Loading required package: libcoin
#> Loading required package: mvtnorm

tree_spec <- 
  decision_tree(
    cost_complexity = 1e-10,
    tree_depth = 4
  ) %>% 
  set_engine("rpart") %>% 
  set_mode("classification")

rpart_fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

parsnip_fit <- tree_spec %>%
  fit(Kyphosis ~ Age + Number + Start, data = kyphosis)

as.party(repair_call(parsnip_fit, kyphosis)$fit)
#> 
#> Model formula:
#> Kyphosis ~ Age + Number + Start
#> 
#> Fitted party:
#> [1] root
#> |   [2] Start >= 8.5
#> |   |   [3] Start >= 14.5: absent (n = 29, err = 0.0%)
#> |   |   [4] Start < 14.5
#> |   |   |   [5] Age < 55: absent (n = 12, err = 0.0%)
#> |   |   |   [6] Age >= 55
#> |   |   |   |   [7] Age >= 111: absent (n = 14, err = 14.3%)
#> |   |   |   |   [8] Age < 111: present (n = 7, err = 42.9%)
#> |   [9] Start < 8.5: present (n = 19, err = 42.1%)
#> 
#> Number of inner nodes:    4
#> Number of terminal nodes: 5

Created on 2020-05-28 by the reprex package (v0.3.0)
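As a rough check of what `repair_call()` changes, the stored call can be compared before and after the repair. This is a sketch under the same assumption as the reprex above (a parsnip version that exports `repair_call(x, data)`); it only prints the calls and does not assert their exact contents.

```r
library(parsnip)
library(rpart)

tree_spec <-
  decision_tree(cost_complexity = 1e-10, tree_depth = 4) %>%
  set_engine("rpart") %>%
  set_mode("classification")

parsnip_fit <- tree_spec %>%
  fit(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Call as recorded by parsnip at fit time.
print(parsnip_fit$fit$call)

# repair_call() returns the model fit with an updated call; the original
# parsnip_fit object is left unchanged.
repaired_fit <- repair_call(parsnip_fit, data = kyphosis)
print(repaired_fit$fit$call)
```

Note that the repaired object has to be the one passed on to `as.party()`, since `repair_call()` returns a new object rather than modifying the fit in place.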

I'm going to open an issue on tidymodels/tidymodels.org to add the partykit plot back to one of the decision tree articles.
