JuliaPlots / Plots.jl

Powerful convenience for Julia visualizations and data analysis
https://docs.juliaplots.org

[FR] Avoid gratuitous use of transparency #4940

Open mgkuhn opened 5 months ago

mgkuhn commented 5 months ago

I have long noticed two problems when printing PDFs of scientific papers that contain PDF vector plots made by Plots.jl:

Problem 1:
pages containing even very basic plots take unusually long to render in the printer
Problem 2:
on pages containing plots the surrounding main text looks slightly darker and fuzzier than on pages without plots

This occurs when I use lpr paper.pdf on printers (e.g. RICOH Aficio SP C440DN) driven by a Linux CUPS server (e.g., Ubuntu Linux) with a filter that uses Ghostscript's ps2write output device to convert the PDF into PostScript before sending it to the laser printer.

If I bypass Ghostscript (by printing to a CUPS “raw queue” with lpr -o raw paper.pdf), then the problem sometimes disappears, but often I instead get

Problem 3:
the PDF sent directly to the printer doesn't print at all.

I've now finally identified the source of this problem: transparency!

Specifically, even the most simple plot, such as

using Plots
p = plot([1, 2, 1])
savefig(p, "grid-default.pdf")

by default features grid lines with p[1][:xaxis][:gridalpha] == 0.1, causing backends such as GR and PGFPlotsX to draw the major grid lines in black with just 10% opacity, instead of the equivalent opaque Gray(0.9) colour.

The problem with transparency in vector graphics is that this is a rather complicated construct (42 pages of the PDF spec deal with it), and can require quite a lot of memory to render. In particular, whenever Ghostscript ps2write encounters any transparency in a PDF, it no longer tries to output vector graphics, and instead rasterizes the entire page and sends it as a bitmap to the laser printer, resulting in a lower resolution and fuzzier image (due to raster resampling in the printer). This is because PostScript doesn't support transparency. Also, the way a PDF rasterizer needs to implement transparency groups means that (unlike a PostScript RIP) it needs to hold several raster layers simultaneously in memory, and laser printers with limited RAM can easily get overwhelmed by this and abort the print.

Known workarounds:

  1. use Adobe Reader to convert a PDF into PostScript
  2. use Adobe Acrobat Pro to “flatten” the PDF (eliminating transparency in a vector image)
  3. flatten the Plots.jl grid lines to avoid use of transparency in the first place

Adobe Reader and Acrobat appear to contain a fancy computational-geometry algorithm to flatten transparency out of PDFs while preserving the vector graphics, but they are closed-source products that are not supported any more on Linux.

A quick demo of how to fix this in Plots.jl by setting :gridalpha=1.0 and manually flattening the foreground_color_grid to Gray(0.9) (in a way that respects the default foreground and background settings):

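using Plots
using ColorBlendModes

p = plot([1, 2, 1])
savefig(p, "grid-default.pdf")

# flatten grid transparency in plot p
fg = Plots.fg_color(p.attr)          # default foreground colour
bg = p.attr[:background_color]       # plot background colour
alpha = p[1][:xaxis][:gridalpha]     # grid opacity (0.1 by default)
grid = blend(bg, fg; opacity=alpha)  # composite the grid colour over the background
plot!(p; gridalpha=1.0, foreground_color_grid=grid)

# pass p explicitly so that the flattened plot is the one saved
savefig(p, "grid-flattened.pdf")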

The difference is easy to spot in the file length after conversion to PostScript:

$ gs -dNOPAUSE -dBATCH -sDEVICE=ps2write -sOutputFile=grid-default.ps  grid-default.pdf
$ gs -dNOPAUSE -dBATCH -sDEVICE=ps2write -sOutputFile=grid-flattened.ps  grid-flattened.pdf
$ wc -c grid-*.ps
402199 grid-default.ps
179979 grid-flattened.ps

grid-default.ps has become a much larger (and cruder-looking) bitmap image.

Would it be possible to make this workaround the default, i.e. avoid transparency in grid lines entirely? In other words, make the default gridalpha=1.0 and minorgridalpha=1.0.

I believe transparency is an advanced feature of PDF that causes much more trouble than is justified for such a minor use in every single plot under the default settings. I would like to suggest that Plots.jl should draw lines with opacity < 1.0 only if a user specifically asks for it.

BeastyBlacksmith commented 5 months ago

Sounds reasonable. Would you like to work on it?

dd0 commented 5 months ago

I'd be interested in working on a patch to remove the (default) use of transparency for gridlines. I suggest that we keep the current default values, but replace them when preprocessing arguments, flattening as in the example above.

This way, we avoid breaking existing scripts that set one of gridalpha or foreground_color_grid and use the default value for the other.

Flattening grid colours in this way should not change the resulting plot, except that (tiny) grid intersections will now have the same colour as gridlines instead of the slightly darker overlap.
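The overlap effect mentioned above can be checked with a little arithmetic. This is only a sketch on scalar gray levels, assuming the backend draws each gridline as a separate transparent stroke; the helper name `over` is made up for illustration and is not part of Plots.jl.

```julia
# Blend a gray level `src` at opacity `alpha` over an opaque gray backdrop `dst`.
# Gray levels run from 0.0 (black) to 1.0 (white).
over(src, alpha, dst) = alpha * src + (1 - alpha) * dst

background = 1.0   # white
grid_gray  = 0.0   # black grid lines
gridalpha  = 0.1   # current default opacity

# One transparent gridline on the background.
single = over(grid_gray, gridalpha, background)
# A second transparent gridline crossing the first one.
crossing = over(grid_gray, gridalpha, single)
# Flattened variant: opaque lines in the pre-blended colour, crossing each other.
flat_crossing = over(single, 1.0, over(single, 1.0, background))

println((single, crossing, flat_crossing))
```

With these inputs a single line comes out at a gray level of about 0.9, the transparent crossing at about 0.81, and the flattened crossing stays at about 0.9, which is the only visible difference between the two approaches on an empty background.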

Does this sound like a reasonable approach? Should the behaviour be user-configurable (e.g. via a flattengridalpha parameter defaulting to true)?

BeastyBlacksmith commented 5 months ago

That sounds like a good plan if you want to have this in a 1.x release.

It would also be fine to just remove the use of transparency on the v2 branch if you can wait for the release.

In any case this does not need to be user configurable.

mgkuhn commented 5 months ago

@dd0 That sounds like a good approach.

By the way: I used ColorBlendModes.jl because it had the required compositing formulae readily available. But since it is quite a comprehensive package, I'm not necessarily suggesting adding it as a dependency. The basic compositing formulae to flatten two RGBA values are described in Section 11.3 of the ISO 32000-1:2008 PDF specification, and originally come from Porter and Duff: Compositing Digital Images (SIGGRAPH '84).
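For reference, the "over" operator for the Normal blend mode fits in a few lines without extra dependencies. The following is a minimal sketch, not code from Plots.jl or ColorBlendModes.jl; the function name `composite_over` and the plain-tuple colour representation are assumptions made for illustration.

```julia
# Porter-Duff "over" compositing of a source colour onto a backdrop.
# Colours are (r, g, b, a) tuples with non-premultiplied components in [0, 1].
# `composite_over` is a hypothetical helper, not part of Plots.jl.
function composite_over(src::NTuple{4,Float64}, dst::NTuple{4,Float64})
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    # Resulting alpha: source coverage plus whatever backdrop shows through.
    oa = sa + da * (1 - sa)
    oa == 0 && return (0.0, 0.0, 0.0, 0.0)   # both inputs fully transparent
    # Weighted average of the colour components, normalised by the result alpha.
    mix(s, d) = (s * sa + d * da * (1 - sa)) / oa
    return (mix(sr, dr), mix(sg, dg), mix(sb, db), oa)
end

# Default grid: black at gridalpha = 0.1 over an opaque white background.
grid = composite_over((0.0, 0.0, 0.0, 0.1), (1.0, 1.0, 1.0, 1.0))
println(grid)   # each colour component is approximately 0.9, alpha is 1
```

Because the background is opaque here, the result is opaque too, so it can be used directly as foreground_color_grid together with gridalpha=1.0.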

dd0 commented 4 months ago

I implemented this approach, but after testing I don't think it's a complete solution due to https://github.com/JuliaPlots/Plots.jl/issues/4202: GR and PGFPlotsX draw gridlines on top of axis lines, and so if they overlap the axis line will be drawn with the blended colour.

For example, with Plots 1.40.4 and plot(sin.(0:0.1:2π), yticks=-1:0.5:1, yrange=(-1,1)) we have:

[image: out-orig]

After flattening colours, equivalent to gridalpha = 1, foreground_color_grid = RGBA{Float64}(0.9,0.9,0.9,1.0):

[image: out]

The x-axis is covered by a gridline and is now gray:

[images: sample-orig, sample]

This means that we can't currently assume that the transparent gridline is always drawn on the background and replace it with a grid/background colour blend. It seems to me that the cleanest solution would be to first fix https://github.com/JuliaPlots/Plots.jl/issues/4202.
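The draw-order problem can be reproduced with the same kind of arithmetic. This is a sketch on scalar gray levels, assuming a black axis line painted first and a gridline painted on top of it at the same position; the helper name `over` is hypothetical.

```julia
# Blend a gray level `src` at opacity `alpha` over an opaque gray backdrop `dst`.
# Gray levels run from 0.0 (black) to 1.0 (white).
over(src, alpha, dst) = alpha * src + (1 - alpha) * dst

background = 1.0   # white
axis_gray  = 0.0   # black axis line

# The axis line is painted first, fully opaque.
after_axis = over(axis_gray, 1.0, background)

# Original behaviour: black gridline at 10% opacity on top of the axis.
transparent_on_axis = over(0.0, 0.1, after_axis)

# Flattened behaviour: opaque gridline pre-blended against the background (0.9).
flattened_on_axis = over(0.9, 1.0, after_axis)

println((transparent_on_axis, flattened_on_axis))
```

The transparent gridline leaves the axis black (gray level 0.0), while the pre-blended opaque gridline replaces it with gray level 0.9, matching the gray x-axis in the screenshots.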

mgkuhn commented 4 months ago

We probably also should check the code for locations where color=RGBA(0,0,0,0) is (ab)used to just indicate that something shouldn't be drawn, as that could be another way for transparency to enter a PDF backend.