Wishlist: pointrange and ribbons

grantmcdermott commented 1 year ago

Especially for regression coefficients.

See the nice answer by @karoliskoncevicius here: https://stackoverflow.com/a/75971206/4115816

(HT @vincentarelbundock)

Tracking:

[x] Point range (#35, #38)
[x] Error bars (#39)
[x] Ribbons

grantmcdermott commented 1 year ago

Thinking about this, the nicest user interface might be to enable NSE with a data argument. I'm not sure how this would interact (conflict?) with the existing plot2.default and plot2.formula methods. But this NSE-style dispatch would be a cool alternative that provides a nice interface for other plot types that don't take a "standard" set of arguments (e.g., density plots, where we've been discussing a one-sided formula option in #21).

Proof of concept:

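pointrange = function(
    x,
    y,
    ymin,
    ymax,
    data
  ) {

  ## NSE ----
  ## Map each column name to its position, so that a bare column name
  ## passed as an argument evaluates to a column index.
  nl = as.list(seq_along(data))
  names(nl) = names(data)
  x = eval(substitute(x), nl, parent.frame())
  if (is.numeric(x)) x = data[,x]
  ## Terms are drawn at integer positions and labelled on the axis below.
  x_seq = seq_along(x)
  y = eval(substitute(y), nl, parent.frame())
  if (is.numeric(y)) y = data[,y]
  ymin = eval(substitute(ymin), nl, parent.frame())
  if (is.numeric(ymin)) ymin = data[,ymin]
  ymax = eval(substitute(ymax), nl, parent.frame())
  if (is.numeric(ymax)) ymax = data[,ymax]

  ## Empty canvas sized to cover the full span of the intervals
  plot.new()
  plot.window(xlim = range(x_seq), ylim = range(c(ymin, ymax)))
  grid()

  ## Point estimates plus vertical interval segments
  points(x_seq, y, pch = 16)
  segments(x_seq, ymin, x_seq, ymax)

  ## Label the x-axis with the original (e.g. term) values
  axis(1, at = x_seq, labels = x)
  axis(2, las = 2)

  title(ylab = "Estimate")
}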

Example use:

mod = lm(mpg ~ hp + factor(cyl), mtcars)
coefs = data.frame(names(coef(mod)), coef(mod), confint(mod)) |>
  setNames(c("x", "y", "ymin", "ymax"))

pointrange(x = x, y = y, ymin = ymin, ymax = ymax, data = coefs)

vincentarelbundock commented 1 year ago

Not sure about introducing NSE. The base function doesn't have it, right?

Feels like this introduces code complexity and departs from the spirit of the original.

Isn't it easy enough to call with(plot2())?
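
To illustrate the with() suggestion, here is a minimal sketch. It uses base graphics as a stand-in, on the assumption that plot2() does not yet accept ymin/ymax arguments. The coefs data frame is built as in the example above. Because with() evaluates the plotting calls inside the data frame, columns can be referenced by bare name without any custom NSE.

```r
# Coefficient table with point estimates and 95% confidence bounds
mod = lm(mpg ~ hp + factor(cyl), mtcars)
coefs = data.frame(names(coef(mod)), coef(mod), confint(mod)) |>
  setNames(c("x", "y", "ymin", "ymax"))

# with() evaluates the expression using the columns of `coefs`,
# so the plotting code itself only relies on standard evaluation
with(coefs, {
  x_seq = seq_along(x)
  plot(x_seq, y, pch = 16, ylim = range(c(ymin, ymax)),
       xaxt = "n", xlab = "", ylab = "Estimate")
  segments(x_seq, ymin, x_seq, ymax)
  axis(1, at = x_seq, labels = x)
})
```

The trade-off is that the caller writes a few more lines, but transformed variables such as log(y) would work inside with() as usual.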

zeileis commented 1 year ago

Standard non-standard evaluation via formula+data interfaces is already messy enough. Non-standard non-standard evaluation without formulas is a can of worms that I would prefer to avoid. Otherwise you have to worry about transformed variables again, NA handling, etc.
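
A small sketch of the transformed-variable problem, assuming the name-to-position lookup used in the proof of concept above. The toy data frame is made up for illustration.

```r
# Toy data frame with the same column layout as the example above
coefs = data.frame(
  x    = c("a", "b"),
  y    = c(10, 20),
  ymin = c(8, 17),
  ymax = c(12, 23)
)

# The lookup list used by the proof of concept: column names -> positions
nl = as.list(seq_along(coefs))
names(nl) = names(coefs)

# A bare column name resolves to its position, which then indexes the data
idx = eval(quote(y), nl)
idx            # 2
coefs[, idx]   # 10 20

# A transformed variable is evaluated on the position, not on the column
bad = eval(quote(log(y)), nl)
bad            # log(2), not log(c(10, 20))
```

Since log(2) is still numeric, the proof of concept would go on to use it as a column index, which silently selects the wrong thing rather than raising an informative error.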

Plus, pointrange() would rather be a helper function, wouldn't it? For the user, something like plot2(tidy(...)) or plot2(coeftest(...)) would be easier, I guess. And then you can easily avoid non-standard evaluation by computing the necessary variables explicitly in the plot2() method.
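
A rough sketch of that idea, under some assumptions: coefplot() is a hypothetical generic invented here for illustration (it is not part of plot2), and base coef() and confint() stand in for tidy() and coeftest() so that the example has no package dependencies. The column names follow the broom convention already used in this thread.

```r
# Hypothetical generic; the name is illustrative, not part of plot2
coefplot = function(object, ...) UseMethod("coefplot")

# Method for lm objects: every plotted variable is computed explicitly,
# so no non-standard evaluation is needed
coefplot.lm = function(object, level = 0.95, ...) {
  est = coef(object)
  ci  = confint(object, level = level)
  dat = data.frame(
    term      = names(est),
    estimate  = unname(est),
    conf.low  = ci[, 1],
    conf.high = ci[, 2],
    row.names = NULL
  )

  # Draw point estimates with vertical interval segments
  x_seq = seq_len(nrow(dat))
  plot(x_seq, dat$estimate, pch = 16,
       ylim = range(c(dat$conf.low, dat$conf.high)),
       xaxt = "n", xlab = "", ylab = "Estimate", ...)
  segments(x_seq, dat$conf.low, x_seq, dat$conf.high)
  axis(1, at = x_seq, labels = dat$term)

  invisible(dat)  # return the computed values for inspection
}

mod = lm(mpg ~ hp + factor(cyl), mtcars)
res = coefplot(mod)
```

The user-facing call is a single function applied to the model object, and the method owns all decisions about which quantities to compute.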

grantmcdermott commented 1 year ago

> Isn't it easy enough to call with(plot2())?

The difficulty, at least as far as I can see, is that there's no sensible way to specify a self-contained formula syntax that allows ymin and ymax for the error bars.

Of course, vanilla plot() doesn't take ymin/ymax arguments either, though there's nothing stopping us from supporting them in plot2() (or in some helper function like pointrange). In my view, the following isn't super user-friendly: plot2(x = dat$term, y = dat$estimate, ymin = dat$conf.low, ymax = dat$conf.high). But maybe Vincent's idea to use with() instead gets around that.

Regardless, I hear you both on the potential pitfalls of NSE, particularly when it comes to transformed variables. Let's revisit once we have the palette and legend defaults finalized, and also the faceting stuff.
