Optimize out useless clipping

zackw commented 12 years ago

Grid-based graphics tend to produce tons of useless clipping (that is, clipping scopes within which nothing is drawn). Even just calling grid.newpage by itself...

suppressPackageStartupMessages({
  library(tikzDevice)
  library(grid)
})
tikz("grid1.tex", standAlone=TRUE, width=2, height=2)
grid.newpage()
invisible(dev.off())

produces a picture environment containing two useless empty clipping scopes:

\begin{tikzpicture}[x=1pt,y=1pt]
\definecolor[named]{drawColor}{rgb}{0.00,0.00,0.00}
\definecolor[named]{fillColor}{rgb}{1.00,1.00,1.00}
\fill[color=fillColor,fill opacity=0.00,] (0,0) rectangle (144.54,144.54);
\begin{scope}
\path[clip] (  0.00,  0.00) rectangle (144.54,144.54);
\end{scope}
\begin{scope}
\path[clip] (  0.00,  0.00) rectangle (144.54,144.54);
\end{scope}
\end{tikzpicture}

Doing anything more complicated can produce dozens of these useless clips: for instance, the output of

library(ggplot2)
tikz("grid1.tex", standAlone=TRUE, width=2, height=2)
qplot(1, 1, geom="point")
invisible(dev.off())

contains 157 of them.

This is, strictly speaking, grid's fault rather than tikzDevice's, but I can imagine that it might be easier to fix at the device level. Bonus points for also eliminating clipping paths that are useless not because nothing is drawn, but because everything that is drawn is strictly inside the clip.
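The "bonus" case above amounts to a containment test. Here is a minimal sketch in C, assuming the device tracks a bounding box of everything drawn since the last clip. The `Rect` type and `clip_is_redundant` are hypothetical names, not part of tikzDevice, and the box would have to account for line widths and text extents to be safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical axis-aligned rectangle with x0 <= x1 and y0 <= y1. */
typedef struct {
    double x0, y0, x1, y1;
} Rect;

/* A clip is redundant when the bounding box of everything drawn under it
   lies strictly inside the clip rectangle. Assumption: `drawn` already
   includes stroke widths and any other extent of the drawn shapes. */
bool clip_is_redundant(Rect drawn, Rect clip)
{
    assert(drawn.x0 <= drawn.x1 && drawn.y0 <= drawn.y1);
    assert(clip.x0 <= clip.x1 && clip.y0 <= clip.y1);
    return drawn.x0 > clip.x0 && drawn.y0 > clip.y0 &&
           drawn.x1 < clip.x1 && drawn.y1 < clip.y1;
}
```

Content that touches the clip boundary is treated as needing the clip, which errs on the side of keeping it.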

Sharpie commented 12 years ago

Currently, this isn't possible. The function calls in the tikzDevice which write TeX commands to the output file are just the last step in a long pipeline of function calls in the R graphics system that make sense of commands like "draw a circle" or "start a clipping scope" and convert them to TikZ code.

In order to do something like this, the device would have to push all graphics calls onto a stack of requested operations and defer writing to a file until the plot is closed or a new page is requested. Optimization passes could then rewrite the operation stack before it is flushed to the output file.

I have thought about doing this as it would allow for things like centralizing all color definitions at the beginning of the graphic so that they wouldn't have to be repeated multiple times.

However, moving to such a system would require a massive re-write of the device and that probably won't happen until after version 1.0 is released.
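As a rough illustration of the deferred design described above, here is a minimal sketch in C. It assumes the device records each operation in an array and runs one pass that drops empty clipping scopes before writing. `Op`, `OpKind`, `drop_empty_clips` and `flush_ops` are hypothetical names, not tikzDevice's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical operation kinds recorded instead of being written at once. */
typedef enum { OP_CLIP_BEGIN, OP_CLIP_END, OP_DRAW } OpKind;

typedef struct {
    OpKind kind;
    const char *tex;   /* TeX text to emit for this operation */
} Op;

/* Optimization pass: remove clip scopes that contain no drawing.
   The array is compacted in place and the new length is returned.
   The kept prefix acts as a stack, so nested empty scopes collapse too. */
size_t drop_empty_clips(Op *ops, size_t n)
{
    assert(ops != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (ops[i].kind == OP_CLIP_END && out > 0 &&
            ops[out - 1].kind == OP_CLIP_BEGIN) {
            out--;                 /* cancel the empty begin/end pair */
        } else {
            ops[out++] = ops[i];   /* keep this operation */
        }
    }
    return out;
}

/* Write the surviving operations once the page or plot is closed. */
void flush_ops(const Op *ops, size_t n, FILE *fp)
{
    assert(fp != NULL);
    for (size_t i = 0; i < n; i++)
        fputs(ops[i].tex, fp);
}
```

Other passes, such as collecting color definitions at the top of the picture, could run over the same array before `flush_ops` is called.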

zackw commented 12 years ago

Understood. If you ever get around to wanting to do that rewrite, I might be interested in helping; however, I cannot predict my time commitments very far into the future (and right now I don't have time to do more than file bug reports).

zw

Sharpie commented 12 years ago

On second thought, it may be possible to do this by having TikZ_Clip set a flag indicating clipping is requested instead of printing output. Then every function call that actually draws stuff, such as TikZ_Line, would print the clip commands if the flag was set and then clear it.

Not as clean a solution as storing and optimizing the operation stack, but it would remove a bunch of useless clutter from the output without much change in the implementation.
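The flag idea can be sketched roughly as follows. This is a standalone illustration in C, not the code from the actual commit. `DeviceState`, `device_clip`, `device_line` and `device_close` are hypothetical stand-ins for the device's state and for routines like TikZ_Clip and TikZ_Line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical device state; field names are illustrative only. */
typedef struct {
    FILE *out;
    bool clip_pending;      /* a clip was requested but not yet written */
    bool scope_open;        /* a scope was written and is still open */
    double x0, y0, x1, y1;  /* most recently requested clip rectangle */
} DeviceState;

/* Stand-in for TikZ_Clip: remember the request, write nothing. */
void device_clip(DeviceState *d, double x0, double y0, double x1, double y1)
{
    assert(d != NULL);
    d->x0 = x0; d->y0 = y0; d->x1 = x1; d->y1 = y1;
    d->clip_pending = true;
}

/* Called by every drawing routine before it writes its own output. */
static void flush_pending_clip(DeviceState *d)
{
    if (!d->clip_pending)
        return;
    if (d->scope_open)
        fputs("\\end{scope}\n", d->out);
    fprintf(d->out,
            "\\begin{scope}\n\\path[clip] (%6.2f,%6.2f) rectangle (%6.2f,%6.2f);\n",
            d->x0, d->y0, d->x1, d->y1);
    d->scope_open = true;
    d->clip_pending = false;
}

/* Stand-in for TikZ_Line: emit the deferred clip, then the line. */
void device_line(DeviceState *d, double xa, double ya, double xb, double yb)
{
    assert(d != NULL && d->out != NULL);
    flush_pending_clip(d);
    fprintf(d->out, "\\path[draw] (%6.2f,%6.2f) -- (%6.2f,%6.2f);\n",
            xa, ya, xb, yb);
}

/* At page or device close: end the last scope, drop any unused clip. */
void device_close(DeviceState *d)
{
    assert(d != NULL);
    if (d->scope_open) {
        fputs("\\end{scope}\n", d->out);
        d->scope_open = false;
    }
    d->clip_pending = false;
}
```

With this arrangement, two clip requests followed by a close write nothing at all, which corresponds to the grid.newpage case at the top of the thread. Only the last clip requested before a drawing call is ever written.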

Sharpie commented 12 years ago

All right, this has been implemented in f43d95d4cabd5058c0367eb77e4e43c45c12e0ab. In some cases, the number of lines in the output was reduced by nearly 75%.

This should make it much easier to make sense of tikzDevice output. Thanks for the suggestion!

yihui commented 12 years ago

This sounds so very cool!! Thanks a lot for the wonderful work, Charlie!

Sharpie / RTikZDevice, issue #45