r-devel / r-project-sprint-2023

Material for the R project sprint
https://contributor.r-project.org/r-project-sprint-2023/

Logging visual outcomes from base graphics calls #75

Open nzgwynn opened 1 year ago

ajrgodfrey commented 1 year ago

After a discussion with Deepayan, we are looking at the display list to see what gets stored. Having access to the information generated by plot.default() may be necessary, but it is not sufficient. Successive calls to plot(), points(), lines(), and abline(), for example, are common and illustrate the challenges we have identified.

There are also functions defined inside plot.*() functions, some of which may need an explicit return() so that the main function returns a complete list().

nzgwynn commented 1 year ago

GitHub repository for gridGraphics: https://github.com/pmur002/gridgraphics

deepayan commented 1 year ago

This gives a list of calls stored on the display list:

summarizeDL <- function()
{
    p <- recordPlot()
    ## each display-list entry stores the called C symbol first; return its name
    lapply(p[[1]], \(x) x[[2]][[1]][[1]])
}

nzgwynn commented 1 year ago

x <- runif(30, 1, 50)
y <- rnorm(30, 0, 1)

# Draw a plot
plot(x, y)

p <- recordPlot()
p[[1]]

deepayan commented 1 year ago

Version with argument list along with the function calls:

summarizeDL <- function()
{
    ## one-element list holding the call's arguments, named after the C symbol
    getCall <- function(x) {
        structure(list(x[[2]][-1]),
                  names = x[[2]][1][[1]]$name)
    }
    p <- recordPlot()
    lapply(p[[1]], getCall) |> do.call(what = c)
}

nzgwynn commented 1 year ago

We'll start with these:

plot(), points(), lines(), abline(), curve()
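
As a rough illustration of what these five functions leave on the display list, the sketch below draws to a throwaway pdf(NULL) device with recording switched on and tabulates the recorded C entry points. The helper names entry_name() and dl_names() are made up for this example, and the exact entries may differ between R versions.

```r
# Hypothetical helpers (not part of base R or BrailleR)
entry_name <- function(x) {
    sym <- x[[2]][[1]]                     # native symbol stored with the call
    if (is.list(sym) && !is.null(sym$name)) sym$name else NA_character_
}
dl_names <- function(p) {
    unname(vapply(as.list(p[[1]]), entry_name, character(1)))
}

pdf(NULL)                                  # off-screen device, no file written
dev.control(displaylist = "enable")        # file devices do not record by default
set.seed(1)
x <- runif(30, 1, 50)
y <- rnorm(30, 0, 1)

plot(x, y)
n_after_plot <- length(recordPlot()[[1]])  # entries produced by plot() alone

points(x, y + 1, col = "red")
lines(sort(x), y[order(x)])
abline(h = 0, lty = 2)
curve(sin(x / 5), add = TRUE)

calls <- dl_names(recordPlot())
dev.off()

table(calls)
```

Comparing n_after_plot with length(calls) shows how much each later call adds; several of the high-level functions end up as the same C entry point, which is part of why the call names alone are not enough for a description.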

deepayan commented 1 year ago

Initial list of functions that get recorded on the display list, along with a potential list of arguments. The argument list may not be complete or correct in all cases, and will need to be verified.

There are other functions that may be called (that definitely happens for grid-based plots), so this should only be considered as a starting point.

plotc_arglist1 <- 
    list(C_plot_new = c(""),
         C_plot_window = c(""),
         C_axis = c("side, at, labels, tick, line, pos, outer, font, lty, lwd, lwd.ticks, col, col.ticks, hadj, padj, gap.axis, ..."),
         C_plotXY = c("x, y, type, pch, lty, col, bg, cex, lwd, ..."),
         C_segments = c("x0, y0, x1, y1, col, lty, lwd, ..."),
         C_rect = c("xl, yb, xr, yt, col, border, lty, ..."),
         C_path = c("x, y, col, border, lty, ..."),
         C_raster = c("image, xl, yb, xr, yt, angle, interpolate, ..."),
         C_arrows = c("x0, y0, x1, y1, length, angle, code, col, lty, lwd, ..."),
         C_polygon = c("x, y, col, border, lty, ..."),
         C_text = c("xy, labels, adj, pos, offset, vfont, cex, col, font, ..."),
         C_mtext = c("text, side, line, outer, at, adj, padj, cex, col, font, ..."),
         C_title = c("main, sub, xlab, ylab, line, outer, ..."),
         C_abline = c("a, b, h, v, col, lty, lwd, ..."),
         C_box = c("which, lty, ..."),
         C_locator = c("x, y, nobs, ans, saveans, stype"),
         C_identify = c("ans, x, y, l, ind, pos, order, Offset, draw, saveans"),
         C_strHeight = c("?"),
         C_dend = c("?"),
         C_dendwindow = c("?"),
         C_erase = c("?"),
         C_symbols = c("x, y, type, data, inches, bg, fg, ..."),
         C_xspline = c("sx, sy, ss, col, border, res"),
         C_clip = c("x1, x2, y1, y2"),
         C_convertX = c("from, to"),
         C_convertY = c("from, to"))

## split on comma to get a vector of argument names
plotc_arglist2 <- lapply(plotc_arglist1, \(x) trimws(strsplit(x, ",")[[1]]))

ajrgodfrey commented 1 year ago

Summary: Much progress was made in pretty short order. The R Sprint work is complete; plenty of work remains to fully integrate the results into the BrailleR package. Results will be placed in the BrailleR package until a second use case is established; the infrastructure could then be moved to a more useful place. The new functionality pulls useful information from the display list. Processing this information requires development of the text description template.

To-do: share outcome once BrailleR implementation work is complete.

Risks to R: None

Thanks: Deepayan (obviously), with additional support from Thomas L and Gwynn

Learnings: lots, for Jonathan and Gwynn

pmur002 commented 1 year ago

I was going to suggest a 'gridGraphics'-based approach, but all 'gridGraphics' is doing is traversing the display list and looking at the information stored there, which is what it looks like Deepayan has suggested you do.

One complication you might want to consider is that a call to plot() may generate more than one plot on the page, e.g., plot.lm(). To allow for this, 'gridGraphics' generates a hierarchy of viewports and has an explicit naming scheme (see the "Naming Schemes" section in the R Journal article) so that we can differentiate between separate plots. You may have to do something analogous.
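
One simple analogue, sketched here under the assumption that every plot starts with a C_plot_new entry, is to number the display-list entries by the C_plot_new call that precedes them. split_by_plot() and entry_name() are hypothetical helpers, not part of gridGraphics or BrailleR, and this is far cruder than the viewport naming scheme described above.

```r
# Hypothetical helper: name of the C entry point for one display-list entry
entry_name <- function(x) {
    sym <- x[[2]][[1]]
    if (is.list(sym) && !is.null(sym$name)) sym$name else NA_character_
}

# Hypothetical helper: group entry names by the plot they belong to
split_by_plot <- function(p) {
    nms <- unname(vapply(as.list(p[[1]]), entry_name, character(1)))
    is_new <- !is.na(nms) & nms == "C_plot_new"
    plot_id <- cumsum(is_new)              # 0 = anything before the first plot
    split(nms, plot_id)
}

pdf(NULL)
dev.control(displaylist = "enable")
par(mfrow = c(1, 2))
plot(1:10)                                 # first panel
plot(10:1)                                 # second panel
abline(h = 5)                              # belongs to the second panel
by_plot <- split_by_plot(recordPlot())
dev.off()

by_plot
```

Here the abline() entry lands in group "2", so a description can attach it to the right panel.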

deepayan commented 1 year ago

Longer term, we may want to add a summary.recordedplot() function in base R that extracts a minimal set of information from the display list in a non-scary format (one that doesn't involve dotted pair lists and external pointers). This would provide BrailleR and other add-on packages a predictable and documented API to work with.
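
A very rough sketch of what such a summary might return, as a plain data frame with one row per recorded call. This is only an illustration: summarize_recordedplot() is a hypothetical stand-in, base R has no such function, and the columns are an assumption about what a "minimal set of information" could be.

```r
# Hypothetical stand-in for the proposed summary method
summarize_recordedplot <- function(p) {
    entries <- as.list(p[[1]])
    entry_name <- function(x) {
        sym <- x[[2]][[1]]
        if (is.list(sym) && !is.null(sym$name)) sym$name else NA_character_
    }
    data.frame(
        call  = vapply(entries, entry_name, character(1)),
        # arguments stored after the leading native symbol
        nargs = vapply(entries, function(x) length(x[[2]]) - 1L, integer(1)),
        stringsAsFactors = FALSE
    )
}

pdf(NULL)
dev.control(displaylist = "enable")
plot(1:10)
abline(h = 5)
s <- summarize_recordedplot(recordPlot())
dev.off()

s
```

The result contains only character and integer columns, so it can be printed, compared, or stored without touching pairlists or external pointers.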

MichaelChirico commented 1 year ago

+1 to Deepayan's comment. I had the same thought during the chat re: possible regression tests for #74. Getting a nicer object would facilitate some basic regression tests for plots.

nzgwynn commented 1 year ago

I'm interested in working on that.

ajrgodfrey commented 1 year ago

Test bed now includes a simple linear model (specifics not important)

Using par(mfrow=c(2,2)), as is quite common, leads to a single canvas with all four plots, all of which are captured in a single list of 61 C calls.

Not using a single canvas leads to only the last of the four plots being retained in the results of recordPlot(), and a much shorter list of C calls.
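
A minimal way to reproduce the difference, assuming an off-screen pdf(NULL) device with the display list enabled and the built-in cars data as a stand-in for the test model. count_entries() is a hypothetical helper, and the counts will depend on the model and the R version.

```r
# Hypothetical helper: number of display-list entries after plotting an lm fit
count_entries <- function(use_grid) {
    pdf(NULL)
    on.exit(dev.off())
    dev.control(displaylist = "enable")
    if (use_grid) par(mfrow = c(2, 2))     # all four plots on one page
    fit <- lm(dist ~ speed, data = cars)
    plot(fit)                              # four diagnostic plots
    length(recordPlot()[[1]])
}

n_grid   <- count_entries(TRUE)
n_single <- count_entries(FALSE)
c(grid = n_grid, single = n_single)
```

Without the 2 x 2 layout each diagnostic plot starts a new page, and only the entries for the final page remain when recordPlot() is called.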

deepayan commented 1 year ago

Some potentially useful help pages:

?dev.control
?setHook
?plot.new
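
One possible use of these, sketched under the assumption that a "before.plot.new" hook is an acceptable place to call recordPlot(): save the current page just before plot.new() starts the next one, so earlier pages are not lost. The names saved_pages and keep_previous_page() are hypothetical.

```r
saved_pages <- list()

# Hypothetical hook: plot.new() runs it before the new frame is started
keep_previous_page <- function() {
    if (dev.cur() > 1) {                   # only if a device is already open
        p <- recordPlot()
        if (length(p[[1]]) > 0) {          # skip empty display lists
            saved_pages[[length(saved_pages) + 1L]] <<- p
        }
    }
}
setHook("before.plot.new", keep_previous_page, action = "append")

pdf(NULL)
dev.control(displaylist = "enable")        # see ?dev.control
plot(1:10)                                 # page 1
plot(10:1)                                 # page 2: page 1 is saved by the hook
last_page <- recordPlot()
dev.off()

# Clean up; note this removes every "before.plot.new" hook, not just ours
setHook("before.plot.new", NULL, action = "replace")

length(saved_pages)
```

With par(mfrow = ...) in effect the hook also fires between panels of the same page, so the saved snapshots would need to be filtered.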

deepayan commented 1 year ago

Task callbacks: For recording what has happened in the last call:

?taskCallbackManager

Reference: https://developer.r-project.org/TaskHandlers.pdf
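
A small sketch using the simpler addTaskCallback() interface rather than a full taskCallbackManager(): after each top-level call it reports how many entries are on the current display list. report_display_list() and the callback name "dl-report" are made up for this example, and the callback is only useful in an interactive session on a device that records a display list.

```r
# Hypothetical callback. Task callbacks receive four arguments and must
# return TRUE to stay registered.
report_display_list <- function(expr, value, ok, visible) {
    if (dev.cur() > 1) {                   # a graphics device is open
        n <- length(recordPlot()[[1]])
        message("display list now has ", n, " entries")
    }
    TRUE
}

id <- addTaskCallback(report_display_list, name = "dl-report")

# ... interactive plotting calls would go here ...

removed <- removeTaskCallback(id)          # unregister when done
```

Comparing the count from one call to the next would identify which entries the most recent call added.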