Open HDembinski opened 4 years ago
If I remember correctly, the size parameter refers to the length of a side of the "reference square". Most of the markers are defined to be within that square, while some of them are defined to enclose that square (like the big diamond).
On Sat, Nov 16, 2019 at 3:41 PM Hans Dembinski notifications@github.com wrote:
Bug summary
Markers of different types ("o", "s", "*" ...) do not visually appear to be of the same size when their marker size (e.g. ms=8) is equal (matplotlib 3.1.1).
Details
Matplotlib has a great selection of markers, but the relative sizes of these markers are not perceptually uniform, see https://matplotlib.org/3.1.1/api/markers_api.html or this example script:
from matplotlib import pyplot as plt
import numpy as np

plt.style.use("default")
x = np.arange(4)
y = np.ones(4)
for imarker, marker in enumerate("ospv^<>PDdX"):
    # Offset each marker type vertically so the rows can be compared.
    plt.plot(x, y + 0.1 * imarker, marker=marker)
plt.show()
The square "s" and the diamond "D" appear larger than the other markers. The star "*" is the smallest, followed by the pentagon "p" and the plus "P".
[image: image] https://user-images.githubusercontent.com/2631586/68998928-f1aef080-08b8-11ea-8633-805e604aa96b.png
Expected outcome
Markers should appear uniform in size. I think for the star this is very obviously not the case. The area of the star is much smaller than for the circle, and largest for the square and the diamond.
Nevertheless, I don't think an objective geometric criterion like area can be used to make them perceptually uniform in size. I think this needs to be hand-tuned by a human to take into account how humans perceive the relative size of objects.
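To put rough numbers on the area claim, here is a minimal sketch that compares the filled areas of three idealized shapes at the same nominal size. The shape definitions are assumptions chosen for illustration (a circle of diameter 1, a square of side 1, and a five-pointed star inscribed in that circle with the pentagram inner-to-outer radius ratio of about 0.381966); they are not read from Matplotlib's own marker paths.

```python
import math


def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon given as (x, y) pairs."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def star_vertices(outer_radius, inner_ratio=0.381966, points=5):
    """Alternate outer and inner vertices around the origin (assumed shape)."""
    verts = []
    for k in range(2 * points):
        r = outer_radius if k % 2 == 0 else outer_radius * inner_ratio
        angle = math.pi / 2 + k * math.pi / points
        verts.append((r * math.cos(angle), r * math.sin(angle)))
    return verts


size = 1.0  # nominal "marker size" shared by all shapes
areas = {
    "circle": math.pi * (size / 2) ** 2,
    "square": size ** 2,
    "star": polygon_area(star_vertices(size / 2)),
}

for name, area in areas.items():
    ratio = area / areas["circle"]
    print(f"{name:>6}: area = {area:.3f}, relative to circle = {ratio:.2f}")
```

Under these assumptions the star covers well under half of the circle's area, and the square covers more than the circle, which matches the ordering described in the report.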
This would need a discussion if we want to aim for perceptual uniformity.
Anyway, it would be a breaking style change and could only be introduced in the context of a style makeover (likely not before 4.0). Added to the list of possible style changes #14331.
Thanks!
Moving discussion here from #16623. My thought is that while a "perceptually" uniform size would be amazing, we must account for a couple of non-trivial caveats:
1) How "big" a shape looks is going to be drastically affected by the plot elements around it (taught to me under the name "size contrast", see e.g. https://www.cns.nyu.edu/~david/courses/perception/lecturenotes/depth/depth-size.html). So it may or may not be hopeless to have a single number, even if phenomenological, that describes how the "size" of an "x", for example, compares to the size of a "*", since this will likely depend on whether the other markers in the same plot are squares or circles or whatever.
2) How "big" two shapes appear in 2D also depends heavily on the context in which they are being used. For example, there are two common "modalities" in which a shape can be perceived:
a) linearly: as in your example plots above. What really matters here is your visual perception of the shape's "effective radius" in the vertical direction. I'm sure this is well approximated by the std error (in "y") of some appropriately Gaussian-smoothed version of the shape or something.
b) in area: consider the plot plt.scatter(np.random.rand(100), np.random.rand(100), marker='*'); plt.scatter(np.random.rand(100), np.random.rand(100), marker='+'). Which of the two markers you perceive as "larger" on this plot is likely to be different than in the example plot you gave (although I'm sure they will correlate).
In short, how you end up defining "perceptual uniformity" is likely to depend drastically on what test you use to compare the markers. Aside from the technical issues involved in developing a principled metric for perceptual size, there is also the practical, sociological issue of what people's bosses will allow them to use.
I'm currently a grad student, and I know that if I tried to pitch to my boss that I was using some phenomenological scaling for the sizes of the markers in my scatterplots, he would tell me to either find a paper I can cite or revert to what we've always done, which is to use an objective, geometric criterion to scale our markers so that they appear perceptually uniform in a way that's easily justifiable.
So even if we do find a good metric for perceptual uniformity and get good data and standardize it and get it implemented in time for 4.*, I think there's a very strong argument to be made that we should also include an option to define markersize to mean something more geometrically/objectively definable, for those of us who have to justify our marker size choices to skeptical/adversarial reviewers.
Conversely, if we don't get this done in time, I think it would be better to change the marker sizes to follow a well-defined, objective geometric criterion (see below) than to leave them as is, since they frankly produce rather ugly plots as is without hand-tuning the marker sizes.
Here are the options I propose including alongside (or instead of) a perceptually-uniform approach:
1) for area-based plots (like scatter), have the default be for all markers to have unit area (including the marker edge or not, although including the edge will require a bit of work for some markers).
2) for line-based plots (like plot), have the default be for all markers to have the same "average distance from center of mass". This is the std deviation of the marker (+/- edge) reinterpreted to be the support of a uniform probability distribution in the plane. This is a "linear" measure of the marker's size which works well IME for all markers I've tried it on.
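The two criteria above can be sketched for polygonal markers using the standard polygon moment formulas. This is only an illustration: it reads the "std deviation" as the root-mean-square distance from the centroid for a uniform distribution over the filled shape, it ignores the marker edge, and the function names and example shapes are hypothetical rather than anything in Matplotlib.

```python
import math


def polygon_moments(vertices):
    """Return (signed area, centroid x, centroid y, polar second moment about the origin)."""
    a = cx = cy = j = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        j += cross * (x0 * x0 + x0 * x1 + x1 * x1 + y0 * y0 + y0 * y1 + y1 * y1)
    a /= 2.0
    return a, cx / (6.0 * a), cy / (6.0 * a), j / 12.0


def rms_radius(vertices):
    """RMS distance from the centroid, for points uniform over the polygon."""
    a, cx, cy, j = polygon_moments(vertices)
    # Parallel-axis theorem: shift the second moment from the origin to the centroid.
    return math.sqrt(j / a - cx * cx - cy * cy)


def unit_area_scale(vertices):
    """Linear factor that rescales the polygon to area 1."""
    a, _, _, _ = polygon_moments(vertices)
    return 1.0 / math.sqrt(abs(a))


# Example shapes (assumed for illustration only).
square = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
triangle = [(0.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]

for name, shape in [("square", square), ("triangle", triangle)]:
    print(name, rms_radius(shape), unit_area_scale(shape))
```

Dividing each marker's vertices by its `rms_radius` would give every marker the same linear measure, while multiplying by `unit_area_scale` would give every marker the same area; the two normalizations generally disagree, which is why the proposal treats them as separate defaults.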
I am happy to implement a couple of simple GUIs that could be used to do "randomized/single-blind" testing of the relative "sizes" of different markers (i.e. present scatter/line plots with two different markers with randomly chosen sizes included and just ask "which looks bigger? or do they look about the same?"). But ideally these would be distributed for many different people to use and the data collected in some centralized location to be analyzed later. Would posting the code as a gist here be appropriate? Attaching the data to comments on this thread? Or is this conversation more appropriate for the mailing list?
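As a rough idea of what the non-GUI part of such a test could look like, the sketch below generates randomized pairwise comparisons and records an answer for each. Everything here is hypothetical: the marker list, the size range, the field names, and the three allowed answers are placeholders, and a real study would need a proper experimental design.

```python
import random

MARKERS = ["o", "s", "*", "D", "P"]  # placeholder selection of marker codes


def make_trial(rng, size_range=(4.0, 12.0)):
    """Pick two distinct markers and independent random sizes for one comparison."""
    left, right = rng.sample(MARKERS, 2)
    return {
        "left": (left, rng.uniform(*size_range)),
        "right": (right, rng.uniform(*size_range)),
    }


def record_response(trial, answer):
    """Attach the participant's answer: 'left', 'right', or 'same'."""
    if answer not in {"left", "right", "same"}:
        raise ValueError(f"unexpected answer: {answer!r}")
    return {**trial, "answer": answer}


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
trials = [make_trial(rng) for _ in range(5)]
results = [record_response(trial, "same") for trial in trials]
print(results[0])
```

The participant never sees the numeric sizes, only the rendered markers, so the recorded rows can later be pooled across people to estimate at which size ratio two markers are judged equal.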
@brunobeltran Firstly, great to see more comments on this issue.
There are two uses of markers. One is to just annotate different data sets so as to make them distinct. This was my original concern, and most of what you point out does not apply then. The second is the use in a scatter plot, where the size of the marker has meaning. However, even then only the relative size of the markers matters. I have never seen someone making an overlay of a scatter plot of circles and a scatter plot of squares and then comparing the relative sizes of circles and squares with each other. You only compare circles to circles and squares to squares.
If that is true, then there is no problem in computing the size of a marker from the markersize keyword with a fudge factor to adjust for perceived size: internal_markersize = fudge_factor * markersize. Things would still scale as they did before, so relative changes in size are not affected by this change.
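A minimal sketch of that idea follows. The table of factors and the function name are hypothetical, and the numbers are placeholders rather than measured values; the point is only that a per-marker multiplier leaves size ratios within one marker type unchanged.

```python
# Hypothetical per-marker correction factors; placeholder values, not measured.
FUDGE_FACTOR = {"o": 1.0, "s": 0.9, "*": 1.4}


def internal_markersize(marker, markersize):
    """Scale the user-facing markersize by a per-marker factor (default 1.0)."""
    return FUDGE_FACTOR.get(marker, 1.0) * markersize


small = internal_markersize("*", 4)
large = internal_markersize("*", 8)

# Doubling markersize still doubles the drawn size for a given marker.
print(large / small)
```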
Good to hear that alternate perspective @HDembinski!
I have actually used multiple glyphs in contexts where comparing their sizes is meaningful, most often when including e.g. trade volume as a separate channel of information using size. IME, this can look good when there are very few data points, and is useful when color is already being used to signal a different, quantitative variable.
Maybe your comment that you've never seen this before is a sign that I should think more carefully when including four quantitative channels of data using scatter's four main inputs (x, y, s, c). However, these types of plots are not uncommon (https://www.researchgate.net/figure/Scatterplot-of-average-error-rates-and-completion-times-for-all-99-configurations-The_fig3_221557219 and https://www.premraj.me/five-dimensional-scatterplot-using-ggplot2/ were both on the first "page-ish" of my images.google.com results for "scatterplot glyphs").
Regardless of whether or not the above is bad practice, people will still end up using different glyphs in a scatterplot sometimes. My original point (before I forgot to stop typing) was that we want these glyphs to "appear" the same size as well, and that the fudge_factor you reference might change depending on what the plot looks like, and even depending on what the actual pair of glyphs being used is.
I guess I'm happy to move forward with building some kind of blind testing GUI to try to get some values for this fudge_factor, just wanted to point out that we need to be careful in how we design/interpret this testing.
I'll wait for the word of a maintainer on how/if to distribute this test and how/where to tabulate the results before I put any work into it, though.
I am a bit concerned about the risk of spiraling scope on this, much of it outside of the technical skills of many of our developers (I know enough about designing surveys / user research to know I know nothing useful about designing surveys / user research).
Doing this right seems like it is at least a master's thesis worth of work (and I may be underestimating), does anyone in this thread know an academic who would be interested in partnering with us on this? I know that will drastically slow everything down, but if we are going to do this and possibly make a major breaking change to Matplotlib, we should make sure we are on sound footing.
As to whether it should be done, I think I am leaning towards yes. We should do some research (either git log / mailing list spelunking or talking to people) to sort out if there was any systematic logic behind the current sizes. My suspicion is that there was not (or it was "look like MATLAB"); if there was, we should sort out if that reasoning still holds. In either case, if we do have solid guidance on better marker sizes, we should use it (but "solid guidance" is doing a lot of work in that sentence, I am thinking "published paper" level solid).
And to be clear @HDembinski and @brunobeltran I am thankful you are both thinking about this, just cautious about making major changes.
Thanks for the input @tacaswell. To be honest I agree that it sounds like a master's degree's worth of work.
Having dug through the code for marker sizes in depth for a recent PR (#16607), I can tell you that the systematic logic for the current sizes appears (at least a posteriori) to be: "whatever was easiest to code". I go into detail for each marker in #16623. I can provide more explanation if need be, but I think that it's hard to read markers.py and not immediately come to the same conclusion.
Also, to clarify, I take your comment to mean that if I were to perform this research myself, this thread would not be an appropriate place to share data (or to gather willing participants?)
One could start by allowing users to actually size their markers consistently without needing to make use of private attributes (see https://stackoverflow.com/questions/53227057/size-distortion-when-rotating-custom-path-marker-in-matplotlib and https://stackoverflow.com/questions/49660174/rotate-existing-matplotlib-markers). This would already allow anyone to use their custom "fudge_factor".
@ImportanceOfBeingErnest While I agree that MarkerStyle's behavior in these cases should be changed, I think that maybe a separate Issue should be opened for this, since the posts you link suggest some reasonably complex set of API changes. Do you have any Issues open yet for these problems? If not, just @me in one and I'm happy to take the lead of implementing any API changes that are agreed on!
After all, you can already implement custom fudge_factors by just changing what the markersize is for each symbol you use....
@tacaswell I reluctantly agree that doing this correctly is a major task, similar in scope to the work that eventually led to the new colormaps. The question is whether we want to wait for this or whether we should go forward with a less ideal solution, which at least addresses some of the issues.
So as an intermediate step, I propose to change the API so that markersize=value means equal area for all markers. "The perfect is the enemy of the good." I personally don't want to wait for an undefined time until we improve the status quo.
I agree with the points brought up. To be pragmatic, I would also support @HDembinski's idea: given that this will only come with 4.x, why not put this in as a planned feature? If a better way can be found before 4.x, we can go for that. Otherwise, the "best" is for 5.x or later and we have something good for sure "soon".
What is the prior art on this? i.e. what do other packages do?
I guess I'm a little leery of this - if you tell me the marker size of a square is 10 pts, I expect it to be 10 pts across. If you tell me a circle has a marker size of 10 pts I expect its diameter to be 10 pts. Obviously their areas are not perceptually equal. Except for the big diamond and maybe the "x" above, everything looks "correct" to me in that plot.
> you tell me the marker size of a square is 10 pts, I expect it to be 10 pts across.
I agree that linear width is more intuitive. Scatter sizing is maybe the only place where the opposite is even arguable, and that's because it's what Matlab (and then matplotlib) has always done.
> Everything looks "correct" to me in that plot.
But the markers are not all the same width! See #16623 for a detailed breakdown. They are slightlyyyyy different widths and very different heights... Something should be consistent. Right now it's neither width, height, nor area.
Agree as well that constant area now (4.x) and better as soon as it's available (even if only by 5.x) seems like best path forward.
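For reference on the width-versus-area conventions mentioned above: in Matplotlib, `plot` takes `markersize` as a linear size in points, while `scatter` takes `s` in points squared, and one point is 1/72 inch. The helper names below are hypothetical; only the unit relationships come from the Matplotlib documentation.

```python
def markersize_to_scatter_s(markersize):
    """plot() takes a linear size in points; scatter() takes s in points**2."""
    return markersize ** 2


def scatter_s_to_markersize(s):
    """Inverse conversion: linear size in points from an area in points**2."""
    return s ** 0.5


def points_to_pixels(points, dpi):
    """One typographic point is 1/72 inch."""
    return points * dpi / 72.0


print(markersize_to_scatter_s(10))  # value of s matching markersize=10
print(points_to_pixels(10, dpi=100))  # on-screen extent of 10 pt at 100 dpi
```

These conversions only relate the nominal sizes; as the thread notes, how much of that nominal extent each marker actually fills still varies from marker to marker.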
@brunobeltran Thanks for the link to #16623. I agree with your proposed path 2 in #16623 and we should make everything touch your unit box in at least two spots ;-)
Right, markersize should scale the linear width (or height) of the marker bounding box as it does now, but equal markersize should mean equal filled area for markers. I think it is possible to have both (someone please check):
So I think the first step here is to hack on the MarkerStyle class to give it a "consistent" mode (based on whatever approach you want). It looks like part of the scaling logic is there, but maybe not correct.
Once we have that we can sort out what the API to control it should be (on master we can now pass a MarkerStyle object directly thanks to @ImportanceOfBeingErnest, so that is a worst-case failure mode).
Being able to pass the MarkerStyle alone is not sufficient. You would also need to be able to use unscaled paths in it. I created https://github.com/matplotlib/matplotlib/pull/16773 for what would be the minimal requirement for that.
I'd like to add a point: maybe we should also expect the centroid of the marker to be located at the coordinates, which however does not seem to be the case for several markers such as the triangles (if I understand correctly). The plot is generated using the code by brunobeltran in https://github.com/matplotlib/matplotlib/issues/16623
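The centroid point can be checked by hand for a triangle. The sketch below assumes an upward-pointing triangle whose bounding box is centered on the origin (an illustrative shape, not read from Matplotlib's marker definitions) and compares the center of its bounding box with the centroid of its filled area.

```python
def triangle_centroid(vertices):
    """Centroid of a filled triangle: the mean of its three vertices."""
    xs, ys = zip(*vertices)
    return sum(xs) / 3.0, sum(ys) / 3.0


def bbox_center(vertices):
    """Center of the axis-aligned bounding box."""
    xs, ys = zip(*vertices)
    return (min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0


# Assumed upward-pointing triangle with its bounding box centered on the origin.
triangle = [(0.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]

print("bounding-box center:", bbox_center(triangle))
print("area centroid:      ", triangle_centroid(triangle))
```

For this shape the bounding box is centered on the origin while the centroid sits one third of a unit below it, so a marker anchored by its bounding box places its visual mass slightly off the data coordinate.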