gluc / data.tree

General Purpose Hierarchical Data Structure for R
http://gluc.github.io/data.tree

Extract rules from data.tree #126

Closed. HeidiSeibold closed this issue 5 years ago

HeidiSeibold commented 5 years ago

Is it possible to extract rules from a data.tree?

This is similar to this question on Stack Overflow about partykit.

What I am looking for is a data.frame with a variable rule as shown below (but created automatically instead of by hand).

library("partykit")
#> Loading required package: grid
#> Loading required package: libcoin
#> Loading required package: mvtnorm
library("data.tree")

### glmtree
# Pima Indians diabetes data
data("PimaIndiansDiabetes", package = "mlbench")

## recursive partitioning of a logistic regression model
pid_tree2 <- glmtree(diabetes ~ glucose | pregnant +
    pressure + triceps + insulin + mass + pedigree + age,
  data = PimaIndiansDiabetes, family = binomial)

dt_pid <- as.Node(pid_tree2)
res <- ToDataFrameTable(dt_pid, "name", "splitLevel", "splitname")
res$rule <- c("mass <= 26.3", "mass > 26.3 & age <= 30", "mass > 26.3 & age > 30")
res
#>   name splitLevel splitname                    rule
#> 1    2    <= 26.3      mass            mass <= 26.3
#> 2    4      <= 30       age mass > 26.3 & age <= 30
#> 3    5       > 30       age  mass > 26.3 & age > 30

I'd appreciate any hints. Thanks :tulip:

gluc commented 5 years ago

Unfortunately, the rule is not extracted automatically. Supporting this would require a change in https://github.com/gluc/data.tree/blob/master/R/node_conversion_party.R#L165

To be honest, it's been a while since I last worked with partykit. But if you are familiar with it, I'd gladly welcome a pull request!

HeidiSeibold commented 5 years ago

Thanks @gluc for your response. So far partykit does not have a good way to extract rules, so I thought you might have already considered this. Maybe ggparty will give us a solution (https://github.com/mmostly-harmless/ggparty) eventually.

Do you have "rules" as a general concept in data.tree?

gluc commented 5 years ago

No, data.tree is really a generic data structure. I only added conversions from and to some other tree data structures to make it easier to use. Beyond a simple conversion of what's already available in partykit, there is no partykit-related functionality.
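
Because data.tree only exposes the generic tree structure, one possible workaround is to walk from each leaf up to the root and join the split conditions along the way. Below is a minimal sketch in R. It assumes, as the ToDataFrameTable output in the question suggests, that the converted tree stores splitname on the splitting node and splitLevel on each child. The tree is a hand-built stand-in for as.Node(pid_tree2), and PathRule is a hypothetical helper, not part of data.tree or partykit.

```r
library(data.tree)

# Hand-built stand-in for as.Node(pid_tree2).
# ASSUMPTION: splitname lives on the node that splits, splitLevel on each child.
root <- Node$new("1", splitname = "mass")
n2 <- root$AddChild("2", splitLevel = "<= 26.3")
n3 <- root$AddChild("3", splitLevel = "> 26.3", splitname = "age")
n4 <- n3$AddChild("4", splitLevel = "<= 30")
n5 <- n3$AddChild("5", splitLevel = "> 30")

# Hypothetical helper: collect "<parent splitname> <own splitLevel>" for every
# node on the path from the given node up to (but excluding) the root.
PathRule <- function(node) {
  conds <- character(0)
  while (!node$isRoot) {
    # Prepend so that conditions end up ordered from root to leaf
    conds <- c(paste(node$parent$splitname, node$splitLevel), conds)
    node <- node$parent
  }
  paste(conds, collapse = " & ")
}

# One row per leaf, with the rule built automatically
leaves <- root$leaves
res <- data.frame(
  name = unname(vapply(leaves, function(n) n$name, character(1))),
  rule = unname(vapply(leaves, PathRule, character(1))),
  stringsAsFactors = FALSE
)
res
```

On this stand-in tree the rule column should reproduce the three hand-written rules from the question. With a real converted tree, replacing root by dt_pid may work if the fields are laid out as assumed. Splits on factor variables would probably need extra formatting, since their splitLevel is not a simple comparison.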