martin-borkovec / ggparty


Plots are very small #4

Closed HeidiSeibold closed 5 years ago

HeidiSeibold commented 5 years ago

Hi @mmostly-harmless and @Niyazu,

I am just trying out your package. It already looks pretty good.

Is it possible to increase the size of the plots? E.g. could we increase the viewport a bit?

  library("ggparty")
#> Loading required package: ggplot2
#> Loading required package: partykit
#> Loading required package: grid
#> Loading required package: libcoin
#> Loading required package: mvtnorm

  airq <- subset(airquality, !is.na(Ozone))
  airct <- ctree(Ozone ~ ., data = airq)

  a <- ggparty(airct)
  b <- ggparty(airct, horizontal = TRUE)

  a +
    geom_edge() +
    geom_edge_label() +
    geom_node_splitvar() +
    geom_nodeplot(gglist = list(geom_density(aes(x = Ozone)),
                                scale_y_continuous(breaks = c(0, 0.04))),
                  ids = "terminal")
#> Warning: Removed 1 rows containing missing values (geom_segment).
#> Warning: Removed 1 rows containing missing values (geom_label).
#> Warning: Removed 5 rows containing missing values (geom_label).


  b +
    geom_edge() +
    geom_edge_label() +
    geom_node_splitvar() +
    geom_nodeplot(gglist = list(geom_density(aes(x = Ozone)),
                                scale_y_continuous(breaks = c(0, 0.04))),
                  ids = "terminal")
#> Warning: Removed 1 rows containing missing values (geom_segment).
#> Warning: Removed 1 rows containing missing values (geom_label).
#> Warning: Removed 5 rows containing missing values (geom_label).

Created on 2019-02-04 by the reprex package (v0.2.0).

martin-borkovec commented 5 years ago

Reworked geom_nodeplot so that it should work more smoothly now, but viewports are tricky. You can now specify terminal_space in the call to ggparty as the proportion of the plot used for the terminal nodeplots.

There are still some issues with the axis labels interfering with the tick marks, though. I am going to look into that soon.

library("ggparty")
#> Loading required package: ggplot2
#> Loading required package: partykit
#> Loading required package: grid
#> Loading required package: libcoin
#> Loading required package: mvtnorm

airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq)

a <- ggparty(airct, terminal_space = 0.5)
b <- ggparty(airct, terminal_space = 0.5, horizontal = TRUE)

a +
  geom_edge() +
  geom_edge_label() +
  geom_node_splitvar() +
  geom_nodeplot(gglist = list(geom_density(aes(x = Ozone)),
                              scale_y_continuous(breaks = c(0, 0.04))),
                ids = "terminal")
#> Warning: Removed 1 rows containing missing values (geom_segment).
#> Warning: Removed 1 rows containing missing values (geom_label).
#> Warning: Removed 5 rows containing missing values (geom_label).

b +
  geom_edge() +
  geom_edge_label() +
  geom_node_splitvar() +
  geom_nodeplot(gglist = list(geom_density(aes(x = Ozone)),
                              scale_y_continuous(breaks = c(0, 0.04))),
                ids = "terminal")
#> Warning: Removed 1 rows containing missing values (geom_segment).
#> Warning: Removed 1 rows containing missing values (geom_label).
#> Warning: Removed 5 rows containing missing values (geom_label).

Created on 2019-03-12 by the reprex package (v0.2.1)

HeidiSeibold commented 5 years ago

Looks much better already, thanks!

martin-borkovec commented 5 years ago

This is a bug that occurs because the width of the area reserved for the tick marks of the y axis is calculated using the aggregated data of all nodes. I assumed that this way the axes (and with them the tick marks) would encompass all the observations. With a density that's obviously not the case, since the density of the aggregated data never reaches 0.04. Because of the breaks you specified, no tick marks are present during that calculation, which is why the reserved area is not wide enough.
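To make the point about the aggregated density concrete, here is a rough sketch in base R. It is not ggparty's internal code. It assumes that density() with its default bandwidth is a fair stand-in for what geom_density() draws, since both default to the "nrd0" rule. The names pooled_peak, requested_breaks and reachable are only illustrative.

```r
# Sketch only: base R stand-in for the density of the pooled (aggregated) data.
# Assumption: default "nrd0" bandwidth, as in geom_density().
airq <- subset(airquality, !is.na(Ozone))

# Highest point of the density estimated from all observations together
pooled_peak <- max(density(airq$Ozone)$y)

# Breaks requested in the original call (illustrative variable name)
requested_breaks <- c(0, 0.04)

# TRUE for every break that the pooled density curve actually reaches
reachable <- requested_breaks <= pooled_peak

print(pooled_peak)
print(data.frame(breaks = requested_breaks, reachable = reachable))
```

The pooled peak stays below 0.04, so that break lies outside the range seen when the axis area is sized, even though individual terminal nodes can have much higher densities.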

Luckily there's an easy fix if you don't insist on exactly these breaks: just add a break that happens to be reached by the density, like 0.02, or omit the breaks argument completely.

By the way, it's now possible to create shared axis labels, which should leave more space for the plots.

library("ggparty")
#> Loading required package: ggplot2
#> Loading required package: partykit
#> Loading required package: grid
#> Loading required package: libcoin
#> Loading required package: mvtnorm

airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq)

b <- ggparty(airct, terminal_space = 0.5, horizontal = TRUE) +
  geom_edge() +
  geom_edge_label() +
  geom_node_splitvar() 

b + geom_nodeplot(gglist = list(geom_density(aes(x = Ozone)),
                                scale_y_continuous(breaks = c(0, 0.02, 0.04))))


b + geom_nodeplot(gglist = list(geom_density(aes(x = Ozone))),
                  shared_axis_labels = TRUE)

Created on 2019-03-18 by the reprex package (v0.2.1)