ProjectMOSAIC / ggformula

Provides a formula interface to 'ggplot2' graphics.

Make gf_function use additional information of objects of class `stepfun` #114

Open lahvak opened 5 years ago

lahvak commented 5 years ago

Objects of class stepfun are step functions that carry some information about left/right continuity at the jump points. A good example is the result of ecdf(), the empirical cumulative distribution function of a sample.

When plotting such functions with base graphics, the plot shows the discontinuities as steps without vertical connecting segments, with dots indicating continuity, unless there are too many dots, in which case only the horizontal segments are plotted. Also, the default x-limits are obtained from the information provided by the object.

When such a function is passed to gf_function, it is currently plotted as a "continuous approximation" of the step function, with very steep segments in place of the discontinuities. It would be nice if gf_function used the extra information provided by the object.
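To illustrate where the steep segments come from, here is a minimal base-R sketch. It assumes the curve is drawn by evaluating the function on an evenly spaced grid and joining consecutive points; the grid size of 101 is an assumption for illustration, not something stated in this thread.

```r
Fn <- ecdf(c(1, 3, 4, 7))

# Assumed plotting grid: 101 evenly spaced points over the requested x-range.
grid <- seq(0, 8, length.out = 101)
vals <- Fn(grid)

# Consecutive grid points whose values differ straddle a jump; joining them
# with a line produces a steep segment instead of a gap.
jumps <- which(diff(vals) != 0)
steep <- data.frame(
  left  = grid[jumps],
  right = grid[jumps + 1],
  from  = vals[jumps],
  to    = vals[jumps + 1]
)
steep

# The object itself already knows where the jumps are.
knots(Fn)
```

Each row of `steep` brackets one of the knots, which is the information a stepfun-aware method could use directly.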

Compare

gf_function(ecdf(c(1, 3, 4, 7)), xlim = c(0, 8))

with

plot(ecdf(c(1, 3, 4, 7)))

rpruim commented 5 years ago

I've been thinking about this a bit. I don't think there is anything in native ggplot2 to automate this. There may be something in one of the many "add-on" packages, but I'm not aware of it. geom_step() exists, but (a) is for a different job, and (b) always draws the vertical line segments.

In principle, everything we need is available; we just need to convert the step function into a data frame and use gf_segment():

library(ggformula)
theme_set(theme_bw())
# Convert a stepfun (e.g. an ecdf) into a data frame with one row per
# horizontal segment. The padding logic is adapted from stats::plot.stepfun().
fortify.stepfun <- function(model, data, ...) {
  xval <- knots(model)                 # jump points
  rx <- range(xval)
  # amount of padding beyond the outermost knots
  dr <- if (length(xval) > 1L) {
    max(0.08 * diff(rx), median(diff(xval)))
  } else {
    abs(xval) / 16
  }
  xlim <- rx + dr * c(-1, 1)

  # segment end points: the knots plus one padded point on each side
  ti <- c(xlim[1L] - dr, xval, xlim[2L] + dr)
  ti.l <- ti[-length(ti)]
  ti.r <- ti[-1L]
  # evaluate at the midpoints to get the constant value on each segment
  y <- model(0.5 * (ti.l + ti.r))

  structure(
    data.frame(x1 = ti.l, x2 = ti.r, y = y),
    xval = xval, Fn.kn = model(xval)   # knots and the function values there
  )
}

foo <- ecdf(c(1, 3, 4, 7))
gf_segment(y + y ~ x1 + x2, data = fortify(foo)) %>%
  gf_point(y ~ x1) %>%
  gf_point(y ~ x2, shape = 1)

Created on 2018-11-14 by the reprex package (v0.2.0).

The remaining work would be in polishing the UI and dealing with annoying details, such as what happens if some of the segments are not entirely within the viewing window.

Alternatively, we could change either gf_function() or gf_fun() to handle a list of discontinuities, and use the knots of a step function to define the default discontinuities.
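A rough sketch of what the discontinuity-based alternative might look like, in base R. The helper `piecewise_grid()` and its arguments are hypothetical and not part of ggformula; it only builds the data, evaluating the function strictly inside each piece so that no evaluation point falls on a jump.

```r
# Hypothetical helper: evaluate f separately on each interval between
# discontinuities and label the intervals, so a line layer grouped by
# `piece` does not connect across a jump.
piecewise_grid <- function(f, xlim, discontinuities = numeric(0), n = 25) {
  inside <- discontinuities > xlim[1] & discontinuities < xlim[2]
  breaks <- c(xlim[1], sort(discontinuities[inside]), xlim[2])
  pieces <- lapply(seq_len(length(breaks) - 1), function(i) {
    lo <- breaks[i]
    hi <- breaks[i + 1]
    eps <- (hi - lo) * 1e-6            # stay strictly inside the piece
    x <- seq(lo + eps, hi - eps, length.out = n)
    data.frame(x = x, y = f(x), piece = i)
  })
  do.call(rbind, pieces)
}

Fn <- ecdf(c(1, 3, 4, 7))
# For a step function, the knots serve as the default discontinuities.
d <- piecewise_grid(Fn, xlim = c(0, 8), discontinuities = knots(Fn))

# Number of distinct y values within each piece.
tapply(d$y, d$piece, function(v) length(unique(v)))

# With ggformula loaded, something along these lines should draw the pieces
# without vertical connectors (untested here):
# gf_line(y ~ x, group = ~ piece, data = d)
```

Because the grid never touches a discontinuity, the same helper would also work for functions that are not constant between jumps.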

Note: Step functions in R do not record "where the dots go". To plot arbitrary step functions that allow for closure on either end of a segment would require working with something other than an R stepfun. Adding dots at the left end of each step is simpler, but should not be the default, I think.
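Even without an explicit record of "where the dots go", one possible way to decide which end of each step gets the filled dot is to probe the function: compare its value at each knot with its value on the neighbouring segments. The helper `closed_side()` below is hypothetical, and this is only a sketch of the idea.

```r
# Hypothetical helper: for each knot, report which neighbouring segment
# the value at the knot agrees with.
closed_side <- function(sf) {
  k <- knots(sf)
  # one probe point inside every segment, including the two outer ones
  mids <- c(k[1] - 1, (k[-1] + k[-length(k)]) / 2, k[length(k)] + 1)
  left_val  <- sf(mids[-length(mids)])  # value just left of each knot
  right_val <- sf(mids[-1])             # value just right of each knot
  at_knot   <- sf(k)
  ifelse(at_knot == right_val, "right-continuous",
         ifelse(at_knot == left_val, "left-continuous", "neither"))
}

# An ecdf takes the post-jump value at each knot.
closed_side(ecdf(c(1, 3, 4, 7)))

# A stepfun built with right = TRUE takes the pre-jump value instead.
closed_side(stepfun(1:3, c(0, 1, 2, 3), right = TRUE))
```

This only distinguishes the two conventions for a whole stepfun; it does not make mixed closures representable.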