JuliaPlots / Plots.jl

Powerful convenience for Julia visualizations and data analysis
https://docs.juliaplots.org

dpi on backends #733

Open mkborregaard opened 7 years ago

mkborregaard commented 7 years ago
  1. The dpi of the pyplot backend has changed with the shift to matplotlib 2.0. Plots should adjust accordingly.
  2. gr and plotlyjs both ignore the dpi argument
  3. The pyplot backend seems to have the dpi argument backwards:
    using Plots; pyplot()
    scatter(1:10,1:10) # dpi = 100 as default

    [Screenshot, 2017-03-15 at 10:51:59]

    scatter(1:10, 1:10, dpi = 150) # This should result in smaller elements

    [Screenshot, 2017-03-15 at 10:52:17]

An earlier merged PR claimed to fix this (https://github.com/JuliaPlots/Plots.jl/pull/517/files), but apparently it hasn't. It did do something, though: reverting that PR gives

scatter(1:10, 1:10, dpi = 350)

[Screenshot, 2017-03-15 at 10:55:18]

ma-laforge commented 7 years ago

Hi @mkborregaard:

From my own experiments: I don't think sufficient thought was ever put into the DPI question.

It seems to me that DPI does not really have a place the way Plots.jl was defined (it uses different knobs to control how plots look):

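AFAIK:

  • DPI is a way to relate pixel/image size to physical dimensions (ex: 1000 pixels represents 10 cm) - thus the typical units "pixels per inch".
  • Plots.jl defines the canvas size (w, h) in "pixels", and I think the intent is for the user to specify line widths, etc in "pixels" as well. That's how the Cairo backend works by default - but I admit I could not really confirm if that was how Plots.jl was meant to operate.
  • I admit: fonts are typically drawn in "points" (a physical unit measuring 1/72 inch) - so font sizes should, in theory, be affected by DPI. In Cairo, I believe there is a default or implied DPI value used to control font sizes somewhere - but I personally have never changed it.
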
When would I expect Plots.jl to use DPI? Ignoring font sizes: I would expect DPI to be used only if a user wanted to export a pixel-based plot image by specifying (w, h) using physical dimensions like inches, or centimeters. In that case, you kind of need a DPI value in order to figure out how many pixels to use.

EDIT: Conversely, the DPI value could simply be added to the image as an informative-only meta-tag to tell programs like word processors how large to make the image when it first gets inserted in the document. However, this value would not affect the sizes of the symbols or line widths relative to the total image size.
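
To make the physical-size use case above concrete, here is a minimal sketch of the conversion it implies. The helpers `pixels_for` and `pixels_for_mm` are hypothetical names used only for illustration; nothing here is part of Plots.jl.

```julia
# Hypothetical helpers, not part of Plots.jl.

# Number of pixels needed to cover `len_in` inches at `dpi` pixels per inch.
pixels_for(len_in, dpi) = round(Int, len_in * dpi)

# Same conversion for a length given in millimetres (25.4 mm per inch).
pixels_for_mm(len_mm, dpi) = pixels_for(len_mm / 25.4, dpi)

# A 6 in x 4 in figure exported at 100 DPI and at 300 DPI.
size_100 = (pixels_for(6, 100), pixels_for(4, 100))   # (600, 400)
size_300 = (pixels_for(6, 300), pixels_for(4, 300))   # (1800, 1200)
```

The physical size stays the same in both cases; only the pixel count changes with the DPI value.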

FYI: I still have an open issue trying to understand the intent of DPI in Plots.jl: https://github.com/JuliaPlots/Plots.jl/issues/653

Once I understand what Plots.jl intended backends to do with the DPI value, I would probably be able to help fix the PyPlot issue.

tbreloff commented 7 years ago

Who says there was anything intended? :) This is a tricky item to generalize, but we already have the components needed. I agree that unitless numbers generally correspond to points for fonts and pixels for other things.

Ideally the sizes could be any unit and calculations would lazily wait until final output. This is what Measures is for, and one of the best parts of Compose and family. They take the convention that mm is the "base" unit. I generally prefer that pixel is the base unit because my output is generally to the screen. Neither is inherently wrong, but correctly converting between units is a good thing to get right.

Btw... PyPlot sucks for consistency... did I mention I want to deprecate it?

On Wed, Mar 15, 2017 at 6:54 PM ma-laforge notifications@github.com wrote:

Hi @mkborregaard https://github.com/mkborregaard:

From my own experiments: I don't think sufficient thought was ever put into the DPI question.

It seems to me that DPI does not really have a place the way Plots.jl was defined (it uses different knobs to control how plots look):

AFAIK:

  • DPI is a way to relate pixel/image size to physical dimensions (ex: 1000 pixels represents 10 cm) - thus the typical units "pixels per inch".
  • Plots.jl defines the canvas size (w, h) in "pixels", and I think the intent is for the user to specify line widths, etc in "pixels" as well. That's how the Cairo backend works by default - but I admit I could not really confirm if that was how Plots.jl was meant to operate.
  • I admit: fonts are typically drawn in "points" (a physical unit measuring 1/72 inch) - so font sizes should, in theory, be affected by DPI. In Cairo, I believe there is a default or implied DPI value used to control font sizes somewhere - but I personally have never changed it.

When would I expect Plots.jl to use DPI? Ignoring font sizes: I would expect DPI to be used only if a user wanted to export a pixel-based plot image by specifying (w, h) using physical dimensions like inches, or centimeters. In that case, you kind of need a DPI value in order to figure out how many pixels to use.

FYI: I still have an open issue trying to understand the intent of DPI in Plots.jl: #653 https://github.com/JuliaPlots/Plots.jl/issues/653

Once I understand what Plots.jl intended backends to do with the DPI value, I would probably be able to help fix the PyPlot issue.


mkborregaard commented 7 years ago

My original post here stemmed from not understanding dpi. I understand it now. It is intended to be used with size. So, calling scatter(1:10, 1:10, size = (600, 800)) and scatter(1:10, 1:10, size = (1200, 1600), dpi = 200) will result in identical-looking plots with different dimensions, which matters when calling savefig afterwards (but in fact has nothing to do with inches). I see the sense of this behaviour, so I think of my original post as uninformed (though implementing this behaviour for the other backends as well could be considered, if desired). @ma-laforge does this make sense to you too?
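
The arithmetic behind "identical-looking plots with different dimensions" can be sketched as follows. This assumes a reference DPI of 100 and that every pixel-valued element is multiplied by dpi / 100 before rendering; both are assumptions made for illustration, not a description of what any backend actually does. The function names are hypothetical.

```julia
# Assumption: 100 is the reference DPI, and elements given in "pixels"
# are multiplied by dpi / 100 before they are drawn.
dpi_scale(dpi) = dpi / 100

# Width of a line on the canvas, as a fraction of the canvas width.
relative_linewidth(lw, width_px, dpi) = lw * dpi_scale(dpi) / width_px

small = relative_linewidth(1, 600, 100)    # size = (600, 800), dpi = 100
large = relative_linewidth(1, 1200, 200)   # size = (1200, 1600), dpi = 200

# Both lines take up the same fraction of the canvas, so the two plots
# look alike even though the second has twice the pixel dimensions.
same_look = small ≈ large
```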

ma-laforge commented 7 years ago

@tbreloff: I generally prefer that pixel is the base unit because my output is generally to the screen. Neither is inherently wrong, but correctly converting between units is a good thing to get right.

Agreed. Neither is inherently wrong, but I found it practical/easy to use pixels for that exact reason:

When I started using Cairo, I noticed the base transform mapped directly to pixels... and found it much easier to work this way than what I did in the past (which was to draw everything in millimeters - a physical quantity).

Though working in millimeters is fine in theory, it is just annoying to get plots looking good/being readable on the screen. You basically have to figure out how many millimeters give you an x-pixel line width.

My vote: I vote for image size, trace widths, and "symbol" sizes to be in pixels. I am still on the fence about font sizes, though.

We are indeed used to font sizes in "points" (a physical quantity), but that's not really meaningful on a plot unless we specify the plot size using physical quantities as well (ex: inches) or assigning a DPI value.
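
A short sketch of why a point size needs a DPI value before it means anything in pixels. `points_to_pixels` is a hypothetical helper, not a Plots.jl function.

```julia
# One typographic point is 1/72 inch, so converting to pixels needs a DPI value.
points_to_pixels(pt, dpi) = pt * dpi / 72

at_72  = points_to_pixels(12, 72)    # 12.0: at 72 DPI a point and a pixel coincide
at_144 = points_to_pixels(12, 144)   # 24.0: same physical size, twice the pixels
```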

ma-laforge commented 7 years ago

@mkborregaard: [...] dpi [...] is intended to be used with size

Yes, the two are typically used together in order to control image size/resolution.

@mkborregaard: So, calling scatter(1:10, 1:10, size = (600, 800)) and scatter(1:10, 1:10, size = (1200, 1600), dpi = 200) will result in identical-looking plots with different dimensions. Which matters when calling savefig afterwards (but in fact has nothing to do with inches).

That is indeed what I would expect if the default resolution was set to dpi = 100. But that's not what I see at the moment with the pyplot backend.

The main (but slight) correction I would make to your statement is that the differences would not only matter when you call savefig: The second solution should also display as 2x larger on the screen, and have 2x wider lines, etc. ... but I get what you mean: They would both look like exact scaled copies of each other (apart from the enhancements you get from anti-aliasing).

@mkborregaard: (but in fact has nothing to do with inches)

To a first order, I would agree. But I think many image formats have a metadata tag that tells programs how big the image is supposed to be drawn - in some physical unit like inches, or millimeters, or something. This does not really change the quality - or overall look - of the image, though. It simply tells the end user application how big to draw things (mostly useful when you intend to print the image in the real world).

@mkborregaard: though maybe it could be considered to implement this behaviour for the other backends as well, if it's desired

Indeed, that's why I opened issue https://github.com/JuliaPlots/Plots.jl/issues/653.

...But then again, it is not that critical of a feature: The user of Plots.jl can basically get the same effect by doubling the size = (xsize, ysize) value and doubling the line widths, symbol widths, etc. everywhere else in their code.
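
The manual workaround described above could be wrapped in a small helper. This is only a sketch: `scale_attributes` is a hypothetical function, and it touches only `size`, `linewidth`, and `markersize` (font sizes and other attributes are left out).

```julia
# Hypothetical helper, not part of Plots.jl: scale pixel-valued attributes by `k`.
# The keys are Plots.jl keyword names; everything else is illustrative.
function scale_attributes(attrs::Dict{Symbol,Any}, k::Real)
    scaled = copy(attrs)
    for key in (:linewidth, :markersize)
        haskey(scaled, key) && (scaled[key] = scaled[key] * k)
    end
    if haskey(scaled, :size)
        # `size` is a (width, height) tuple in pixels.
        scaled[:size] = map(s -> round(Int, s * k), scaled[:size])
    end
    return scaled
end

attrs = Dict{Symbol,Any}(:size => (600, 400), :linewidth => 1, :markersize => 4)
doubled = scale_attributes(attrs, 2)
```

With Plots loaded, the result could then be splatted into a call such as `scatter(1:10, 1:10; doubled...)`.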

mkborregaard commented 7 years ago

That is indeed what I would expect if the default resolution was set to dpi = 100. But that's not what I see at the moment with the pyplot backend.

Yes, something is off with the dpi scaling - it exaggerates the size of all elements too much, so e.g. setting dpi = 120 gives a similar image when quadrupling(!) the pixel number. Definitely a (minor) bug. Worth looking into this code: https://github.com/JuliaPlots/Plots.jl/blob/2317b77314b7fbf6fd17ca0628d6da3986e4d8ef/src/backends/pyplot.jl#L379-L381

ma-laforge commented 7 years ago

I personally have never quite understood when that py_dpi_scale function should have been used. It currently appears to be scaling all dimensions sent to pyplot.

This is kind of odd to me, because the :dpi setting is already sent to matplotlib in the _show function: https://github.com/JuliaPlots/Plots.jl/blob/master/src/backends/pyplot.jl#L1251-L1262

mkborregaard commented 7 years ago

I think that is only for savefig. It is passed to the application window here, I think: https://github.com/JuliaPlots/Plots.jl/blob/master/src/backends/pyplot.jl#L990. Anyway, I almost think the only way for this to progress is if someone sits down and investigates the behavior on matplotlib and then makes a PR for the dpi behaviour that is either consistent with that or has a good internal Plots logic :-)

ma-laforge commented 7 years ago

Note that there is a quadratic relationship to control the marker sizes as well. I guess this means someone in the call chain decided that markersize was a measure of area instead of height or width.

https://github.com/JuliaPlots/Plots.jl/blob/2317b77314b7fbf6fd17ca0628d6da3986e4d8ef/src/backends/pyplot.jl#L599

Maybe that is the way to go, but I find most people find it easier to think linearly than in terms of quadratics.

mkborregaard commented 7 years ago

I think it is the other way around - markersize in Plots is proportional to the diameter of the marker. This is (unfortunately) common behaviour for plotting packages. When used in a plot to designate a quantity, that quantity should be proportional to the area of the marker - so I always set ms using sqrt(my_variable).

Not sure about the relationship with DPI.

ma-laforge commented 7 years ago

Anyway, I almost think the only way for this to progress is if someone sits down and investigates the behavior on matplotlib and then makes a PR for the dpi behaviour that is either consistent with that or has a good internal Plots logic :-)

That sounds reasonable to me. Do we have a "reference" backend that we can use for this?

By this, I mean: is there a backend that the Plots.jl group put a lot of attention into, to get the dimensions & scaling working the intended way? If so, other backends should be using this as a reference to calibrate their marker sizes, line widths, etc. ...And yes, also to get the DPI setting to behave the same as well.

I say this, because it is my understanding that the (possibly loosely defined) goal of Plots.jl is to provide backend-agnostic plot facilities. In other words, we should get more-or-less the same results, irrespective of which backend the user selects.

Is GR the reference at the moment? It sounds like GR is getting a bit more attention lately - given speed & active development. I know people seem to want GLVisualize to become the reference, but it sounds like it is not at that point yet.

I used to think PyPlot was the reference, but it sounds like it is getting a lot less attention lately.

mkborregaard commented 7 years ago

AFAIK dpi is only implemented for pyplot at the moment.

ma-laforge commented 7 years ago

I think it is the other way around - markersize in Plots is proportional to the diameter of the circle. This is (unfortunately) common behaviour for plotting packages. When used in a plot to designate a quantity, that quantity should be proportional to the area of the marker - so I always set ms using sqrt(my_variable).

I don't think I understand: Do you mean there is a lack of separation between when we use "markers" as simple data (position) markers vs when we generate plots that use "markers" as a way to convey the size of something at a given position?

mkborregaard commented 7 years ago

No - I mean like this https://www.bloomberg.com/graphics/2017-trump-budget/ The size of each circle equals the amount in $. To do this you'd need to do plot(x,y, ms = sqrt(expense)).
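
The sqrt trick can be checked numerically. The sketch below assumes circular markers whose markersize is proportional to the diameter, as discussed in this thread; `area_proportional_sizes` and `ref_size` are hypothetical names.

```julia
# Marker diameters chosen so that marker *area* is proportional to the value.
# `ref_size` is the diameter given to the largest value (an arbitrary choice).
function area_proportional_sizes(values; ref_size = 20)
    return ref_size .* sqrt.(values ./ maximum(values))
end

expense = [100.0, 400.0, 900.0]
ms = area_proportional_sizes(expense)

# Area of a circle with diameter `ms`; the ratios follow `expense` (1 : 4 : 9).
areas = pi .* (ms ./ 2) .^ 2
area_ratios = areas ./ areas[1]
```

The resulting vector could then be passed along the lines of `scatter(x, y, ms = ms)`.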

ma-laforge commented 7 years ago

Sounds like we are saying the same thing in different ways, so let me recap.

sqrt(expense): AFAICT: You are doing this in order for your symbol area to be proportional to expense.

You probably find this annoying a bit because matplotlib's default behaviour is for marker size to represent a value proportional to area (what you personally find intuitive) - but Plots.jl decided that :markersize represents diameter (not area).

Justification: I say this because :markersize gets squared before being passed to matplotlib (https://github.com/JuliaPlots/Plots.jl/blob/2317b77314b7fbf6fd17ca0628d6da3986e4d8ef/src/backends/pyplot.jl#L599).

As a result, (at least when using the pyplot backend) Plots.jl implicitly expects the user to specify a marker size that is proportional to diameter - so it has to square things up to translate to matplotlib's requirement to have marker size be proportional to area.
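
In other words, the translation amounts to something like the sketch below. matplotlib's scatter takes its `s` argument in points squared, i.e. as an area. `diameter_to_mpl_area` is a hypothetical name, and the actual backend code linked above may apply further scaling factors that are not shown here.

```julia
# Plots.jl-style markersize (proportional to diameter) -> matplotlib scatter `s`
# (an area, in points^2). Sketch only; any extra scaling factors are omitted.
diameter_to_mpl_area(ms) = ms^2

s_small = diameter_to_mpl_area(4)   # 16
s_large = diameter_to_mpl_area(8)   # 64: doubling the diameter quadruples `s`
```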

ma-laforge commented 7 years ago

About :markersize being proportional to diameter (how Plots.jl behaves at the moment):

I personally prefer this myself - because I would argue most people expect the bounding box of a marker to double when they double :markersize.

Moreover, markers that are made of pure line art (like the empty + & x markers) don't really have an area to speak of. In these instances, trying to make :markersize relate to area is sort of a lost cause.

I would also argue that most people use markers to simply locate data points. Most plots don't use marker size to convey a z quantity - like an expense.

--> In this particular case, I would expect to use a plot recipe to convey the z values using marker sizes. With such a recipe, you might even be able to normalize the area of an octagon, square, star, filled cross, etc so that areas can be consistent across marker types.
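
As a rough sketch of what such a normalization might involve, the snippet below computes the radius each shape needs in order to cover the same area. It only handles circles and regular polygons (square, octagon); stars and filled crosses are left out, and all function names are hypothetical.

```julia
# Area of a regular n-gon with circumradius r.
ngon_area(n, r) = n / 2 * r^2 * sin(2pi / n)

# Circumradius a regular n-gon needs in order to have area `a`.
ngon_radius(n, a) = sqrt(2a / (n * sin(2pi / n)))

# Radius a circle needs in order to have area `a`.
circle_radius(a) = sqrt(a / pi)

target = 100.0                        # desired marker area, arbitrary units
r_square  = ngon_radius(4, target)    # for a square this is half the diagonal
r_octagon = ngon_radius(8, target)
r_circle  = circle_radius(target)
```

For equal area the circle needs the smallest radius and the square the largest, which is why equal "diameters" give visibly different areas across marker types.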

mkborregaard commented 7 years ago

Oh, I see. matplotlib had the right idea here. I didn't realize that. All the backends of Plots specify the diameter, though (I tried gr, pyplot, plotlyjs and inspectdr). I realize that what people find intuitive differs, of course. The issue is that this will make the naive user create markersizes that go against the established convention if using different markersizes in the same plot. That is unfortunate but perhaps hard to change now.

mkborregaard commented 7 years ago

... although I'd wanted to suggest a revision that made a given markersize denote the same markersize across backends anyway...

ma-laforge commented 7 years ago

... although I'd wanted to suggest a revision that made a given markersize denote the same markersize across backends anyway...

Agreed, this is the point I was trying to make. The DPI problem is actually just a small annoyance at the moment. It is the overall consistency that we are missing.

mkborregaard commented 7 years ago

To my mind, such a (needed) revision should include changing the behaviour for markersize.

ma-laforge commented 7 years ago

@tbreloff: Is the :-1 because you don't think we should be making all backends more consistent - or is it because you also prefer having marker size be proportional to the diameter?

mkborregaard commented 7 years ago

Right, @tbreloff , your call.

tbreloff commented 7 years ago

I hate the area-based-size thing. Personal preference maybe, but it's more commonly radius-based I think.

Also, there's not a strong reason to break every plot ever made...

mkborregaard commented 7 years ago

OK, fair enough. Adjusting markersize to be consistent across backends would also make every plot ever made look different (but agreed, changing the interpretation of markersize would be more tricky, especially for people who did the right thing).

You probably find this annoying a bit because matplotlib's default behaviour is for marker size to represent a value proportional to area

No, I come from R that also has the diameter-behaviour. The reason I find it annoying is because of 10 years of telling students they need to put sqrt in front of markersize if they use it for anything.

ma-laforge commented 7 years ago

That's my opinion as well (also simply a personal preference)...

However, I see no real reason why we could not eventually solidify at least one of the backends as a reference for what the different sizes mean - and try to get the backend maintainers to match that behaviour as closely as possible. It sounds like @mkborregaard agrees with this part.

ma-laforge commented 7 years ago

No, I come from R that also has the diameter-behaviour. The reason I find it annoying is because of 10 years of telling students they need to put sqrt in front of markersize if they use it for anything.

I know I don't fully understand how to build/use plot recipes yet, but could you not build an "officially supported" plot recipe that can be used to generate that type of plot?

mkborregaard commented 7 years ago
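
using Plots   # re-exports @userplot and @recipe from RecipesBase

@userplot GetSizesRightInMyOpinion
# The type name must match the @userplot name exactly (Julia is case-sensitive).
@recipe function f(x::GetSizesRightInMyOpinion)
  # `plotattributes` was called `d` in RecipesBase versions of that era.
  markersize := sqrt.(plotattributes[:markersize])
  x.args   # return the positional arguments as a tuple
end
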
ma-laforge commented 7 years ago

@mkborregaard: I don't quite like how I answered your question regarding dpi & scatter(1:10, 1:10, size = (600, 800)). I tried to oversimplify and it might have caused a bit of confusion.

Sorry for being a bit pedantic here...

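Background: There are 2 important dimensions in images:

  • Dimg: image dimensions, measured in pixels.
  • Dphy: physical dimensions, measured in units such as inches or millimeters.

Of course, DPI is used to relate Dimg to Dphy.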

Sample cases

NOTE: I am omitting font sizes from my examples at the moment (because the units used for font sizes typically complicate things a bit).

Dimg line widths, Dimg plot size

In other words: both line widths & image size are in pixels.

In this case, the DPI value basically changes nothing. Since all element sizes of the plot are controlled using number of pixels, setting a DPI value does nothing to the final image.

In this case, the DPI value can only be used to tell the end user application how large to make the image relative to the page size @ image import.

Dphy line widths, Dimg plot size

Ex: line widths in mm, but image size in pixels.

In this case, the DPI value simply controls how many pixels are used to draw a line of x mm. The larger the DPI value, the thicker the line appears relative to the total image size.

Again, since the image size is specified in pixels, the DPI value can also be used to tell the end user application how large to make the image relative to the page size @ image import.

Dphy line widths, Dphy plot size

Ex: line widths & image size specified in mm. In this case, there is no longer a real tie to pixels until we specify the DPI value. In fact, DPI is the only thing telling you how many pixels you will have when you generate the image.

From the user's perspective, this solution has similar advantages as when the line widths & plot size are both specified in pixels: The user can more easily predict how things will look when rendered onto a canvas because all dimensions are specified with the same units.

The small downside to using physical dimensions (ex: mm) is that the user now has to figure out how large to specify line widths so that they look as expected on a computer screen.

And, in this case, the DPI value no longer affects how the end user application sizes the plot relative to the page size @ image import... Because the physical image size was directly controlled by the user when they specified the plot size.
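
The three sample cases can be compared with a few lines of arithmetic. Each function below returns the line width as a fraction of the image width, which is what determines how the plot looks; the function names are hypothetical and the numbers are arbitrary.

```julia
# Millimetres to pixels at a given DPI (25.4 mm per inch).
mm_to_px(mm, dpi) = mm / 25.4 * dpi

# Case 1: line width and image width both in pixels. DPI has no effect.
case1(lw_px, width_px, dpi) = lw_px / width_px

# Case 2: line width in mm, image width in pixels. Higher DPI gives thicker lines.
case2(lw_mm, width_px, dpi) = mm_to_px(lw_mm, dpi) / width_px

# Case 3: both in mm. The ratio is fixed; DPI only sets the pixel count.
case3(lw_mm, width_mm, dpi) = mm_to_px(lw_mm, dpi) / mm_to_px(width_mm, dpi)

unchanged = case1(2, 600, 100) == case1(2, 600, 200)
thicker   = case2(0.5, 600, 200) / case2(0.5, 600, 100)   # ratio of about 2
fixed     = case3(0.5, 150, 100) ≈ case3(0.5, 150, 300)
```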

mkborregaard commented 7 years ago

Phew, you are a fast typist - do you have a magical keyboard?

So, I agree with your ideas on homogenizing dpi and markersize, but then again others don't always share my intuitions :-) My suggestion would be:

  1. Make sure that @tbreloff would be interested in merging a revision of this; and then
  2. Try to put it in a PR. Should be easy for you.
ma-laforge commented 7 years ago

@mkborregaard: Try to put it in a PR. Should be easy for you.

Before I can do this, I need to know if I understand the intent for Plots.jl (I need to know where I am going).

AFAICT: @tbreloff wanted both line widths & final plot sizes to be specified in pixels. I am more than ok with this. This appears to be how Cairo's default transform works, and I find it works well for developing GUI-capable plots.

I am curious about the history of supporting DPI, though. Why was it added to the pyplot backend?

  1. Was it because someone thought it was the best way to increase the plot resolution (instead of setting the size=(xsize, ysize) value)? If so - I think it should be deprecated.
  2. Was it because someone noticed that font sizes were specified using (typography) points (technically a physical dimension) instead of pixels like everything else - and wanted a knob to quickly grow/shrink all titles and labels?
  3. Maybe that person just wanted to have a way to get matplotlib to modify the .png file's metadata tag that tells end-user applications how large to display the plot when imported.
  4. ...?

I ask because reason 1 is actually invalid if plot size=(xsize, ysize) is specified in pixels - as I assume is the case in Plots.jl.

Moreover, I argue that although reasons 2 & 3 are valid, they are currently adding a lot of confusion to support features that are not that critical (at least not at this stage in the development of Plots.jl).

mkborregaard commented 7 years ago

Cool. The only person who can answer this is @tbreloff (he wrote all the code).

mpastell commented 7 years ago

Personally I'd like to have dpi saved to metadata when calling savefig with all backends. There are several tools that make use of this data and it's annoying not to have it. Pandoc is one widely used example.
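
For PNG files, this metadata lives in the pHYs chunk, which stores an integer number of pixels per metre for the x and y axes separately. The sketch below only shows the unit conversion a backend would need; `dpi_to_pixels_per_metre` is a hypothetical helper, and actually writing the chunk is not shown.

```julia
# PNG's pHYs chunk stores pixels per metre as integers (one inch is 0.0254 m).
dpi_to_pixels_per_metre(dpi) = round(Int, dpi / 0.0254)

ppm_72  = dpi_to_pixels_per_metre(72)    # 2835
ppm_300 = dpi_to_pixels_per_metre(300)   # 11811
```

Because the value is rounded to an integer, reading it back gives a DPI that is very close to, but not always exactly, the one requested.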

ma-laforge commented 7 years ago

Submitted PR https://github.com/JuliaPlots/Plots.jl/pull/740.

I kept control over DPI in order to satisfy the request from @mpastell - and because it required minimal changes wrt the current implementation.

PS I looked at the way Cairo works in a bit more detail: It appears that the font size supposedly is controlled in user units (which default to 1 pixel high with Cairo's initial transform). This matches the units used to draw lines in Cairo (so both lines & fonts are really specified in pixels instead of some other physical dimension).

...But I have not yet found a way to specify the final physical dimensions of a plot or DPI value from within Cairo.

Sorry @mpastell - I don't know how to get Cairo (graphics engine used by inspectdr() backend) to modify the metadata to set the plot's physical dimensions. It appears to be stuck at 72 dpi.

ma-laforge commented 7 years ago

Side comment

I am still not convinced giving users control of DPI was the way to go: most people just get confused when they start talking DPI for pixel-based line art like this.

Instead, I think Plots.jl should get an extra field that specifies the plot's physical dimensions - a field that will only modify the png file's metadata.

Consequently, I think that the font size should be specified in pixel height instead of the typical unit: the typography point (1/72"). That way, all elements of the plots are controlled by values that are proportional to each other (they are all pixel height/width values).

...But we might need to be careful not to give the user too much "control": I am not convinced .png files support arbitrary (non-square) pixel aspect ratios.