daqana / tikzDevice

A R package for producing graphics output as PGF/TikZ code for use in TeX documents.
https://daqana.github.io/tikzDevice

Unnecessarily high precision of paths leads to large file sizes #145

Closed: hbuschme closed this issue 8 years ago

hbuschme commented 8 years ago

Plotting distributions (e.g., with ggplot2's geom_density) results in a TikZ path with a huge number of coordinates. The TikZ file created by the following MWE

library(tikzDevice)
library(ggplot2)

data(tips, package = "reshape")

testplot <- ggplot(tips, aes(x = tip)) + geom_density()

tikz(file = "test.tikz", width=4, height=4)
print(testplot)
dev.off()

testplot

contains, for example, a TikZ path for the distribution curve with more than 1000 samples. This is – at least for my use case – overly precise. The result is a huge file size (for both the .tikz and the compiled .pdf file) when drawing multiple such curves. I have not yet run into a case where LaTeX can no longer compile the file (see #103), but I guess it could happen.

My suggestion would be to add an option where the precision used for such paths can be controlled, e.g., by configuring tikzDevice such that it skips three out of four coordinates when drawing such paths.
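To make the suggestion concrete, here is a minimal sketch of what "skipping three out of four coordinates" would amount to if done on the R side before plotting. The helper name thin_path and its keep_every parameter are made up for illustration and are not part of tikzDevice or ggplot2; the built-in faithful data set stands in for the tips data.

```r
# Hypothetical helper (not part of tikzDevice): keep every k-th point of a path,
# always retaining the last point so the curve still ends where it should.
thin_path <- function(x, y, keep_every = 4) {
  stopifnot(length(x) == length(y), length(x) >= 1, keep_every >= 1)
  idx <- seq(1, length(x), by = keep_every)
  if (tail(idx, 1) != length(x)) idx <- c(idx, length(x))
  data.frame(x = x[idx], y = y[idx])
}

# A density curve evaluated at the default number of points, then thinned
# to roughly a quarter of its coordinates.
d <- density(faithful$eruptions)
thinned <- thin_path(d$x, d$y, keep_every = 4)
nrow(thinned)
```

Thinning by index like this is crude: it ignores curvature, so sharp peaks may lose detail. It only shows the idea.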

Or is this something that needs to be configured at plotting time, directly in R?

krlmlr commented 8 years ago

geom_density() calls stat_density(), which in turn calls stats::density(). The latter does have an n argument that defaults to 512, but I don't see that it's exposed in geom_density() or stat_density(). Perhaps suggest an improvement to ggplot2, or compute the density yourself?

Also, I don't think 512 is a terribly large number of points.
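A rough sketch of the "compute the density yourself" route: evaluate stats::density() at a smaller number of points and plot the resulting curve as a plain line. The built-in faithful data set is used here as a stand-in for the tips data, and n = 64 is an arbitrary choice.

```r
# Assumption: faithful$eruptions (shipped with R) stands in for the tips data.
x <- faithful$eruptions

# stats::density() evaluates the estimate at n equally spaced points
# (the default is 512); here we ask for 64.
d <- density(x, n = 64)
dens <- data.frame(x = d$x, y = d$y)

# Plot the precomputed curve; the path then has one coordinate per row of dens.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(dens, ggplot2::aes(x = x, y = y)) + ggplot2::geom_line()
}
```

Printing p inside a tikz() device should then emit a path with 64 coordinates instead of the default number.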

If you prefer raster images, you can still use TikZ to create a PDF and then convert it to PNG with your desired density using e.g. convert from ImageMagick. There's a brand-new R interface, too.
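For the raster route, something along these lines might work. It is only a sketch, assuming the magick package (and pdftools, which magick uses to read PDFs) is installed; pdf_to_png is a made-up wrapper name, and the file names in the usage comment are placeholders.

```r
# Sketch: rasterize a compiled PDF at a chosen resolution.
# Assumes the magick and pdftools packages are installed.
pdf_to_png <- function(pdf_file, png_file, dpi = 150) {
  stopifnot(file.exists(pdf_file))
  # density is the rendering resolution in dots per inch
  img <- magick::image_read_pdf(pdf_file, density = dpi)
  magick::image_write(img, path = png_file, format = "png")
  invisible(png_file)
}

# Usage (placeholder file names):
# pdf_to_png("test.pdf", "test.png", dpi = 150)
```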

The graphics driver is most certainly not the place where this change should be made.

hbuschme commented 8 years ago

Thanks for the explanation. I can now see that such a feature does not belong in the tikzDevice package.

As you suggested, I made a change in ggplot2 that makes the n argument accessible to users of stat_density() (and hence geom_density()). Once this change makes it into ggplot2, the precision of a density plot can be controlled in the following way:

library(ggplot2)

data(tips, package = "reshape")
ggplot(tips, aes(x = tip)) + geom_density(stat = StatDensity, n = 64)

UPDATE: The change has now been merged into the ggplot2 master branch.