martin-borkovec / ggparty


How to show the count of positive cases in the terminal node? #39

Closed woodysung closed 4 years ago

woodysung commented 4 years ago

The vignette has an example that shows the node size (through geom_node_label()) at the terminal nodes. How can I also show the number of positive cases at each terminal node (suppose I use rpart() to fit a binary classification tree)?

Example: "n=1000, Y=850"

martin-borkovec commented 4 years ago

You can use the ggparty() argument add_vars = list(n_positive = function(data, node) sum(node$data$play == "yes")) to create a variable with the information you want.

Remember, if you want to be able to access it via the gglist argument of geom_node_plot(), you have to add the prefix "nodedata_" to its name in the add_vars list, e.g. nodedata_n_positive. Inside the gglist, however, you refer to it without the prefix.

library("ggparty")
#> Loading required package: ggplot2
#> Loading required package: partykit
#> Loading required package: grid
#> Loading required package: libcoin
#> Loading required package: mvtnorm

data("WeatherPlay", package = "partykit")
sp_o <- partysplit(1L, index = 1:3)
n1 <- partynode(id = 1L, split = sp_o, kids = lapply(2L:4L, partynode))
t2 <- party(n1,
            data = WeatherPlay,
            fitted = data.frame(
              "(fitted)" = fitted_node(n1, data = WeatherPlay),
              "(response)" = WeatherPlay$play,
              check.names = FALSE),
            terms = terms(play ~ ., data = WeatherPlay)
)
t2 <- as.constparty(t2)

ggparty(t2, add_vars = list(n_positive =  function(data, node) sum(node$data$play == "yes"))) +
  geom_edge() +
  geom_edge_label() +
  geom_node_splitvar() +
  geom_node_label(aes(label = paste0("n=", nodesize,", Y=", n_positive)),
                  ids = "terminal",
                  nudge_y =  0.03)+
  geom_node_plot(gglist = list(geom_bar(aes(x = "", fill = play),
                                        position = position_fill()),
                               xlab("play")))

Created on 2020-03-13 by the reprex package (v0.3.0)
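
As a quick sanity check on the labels, the per-node counts can be reproduced without ggparty. This is only a sketch: it assumes, as in the example above, that the root splits on the first column of WeatherPlay (outlook) with one child per level, so each terminal node corresponds to one outlook level. The name node_counts is made up for illustration.

```r
# Only the data set is needed here, so the package does not have to be attached.
data("WeatherPlay", package = "partykit")

# One row per outlook level (assumed to match one terminal node each).
node_counts <- data.frame(
  n          = as.vector(table(WeatherPlay$outlook)),
  n_positive = as.vector(tapply(WeatherPlay$play == "yes",
                                WeatherPlay$outlook, sum)),
  row.names  = levels(WeatherPlay$outlook)
)

# Build the same label text that geom_node_label() is given in the plot.
node_counts$label <- paste0("n=", node_counts$n, ", Y=", node_counts$n_positive)
node_counts
```

If the labels in the plot differ from this table, the function passed to add_vars is probably not looking at the column or level you expect.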

woodysung commented 4 years ago

Thank you very much!