Closed jlstevens closed 5 years ago
"Locations"?
Where does Scatter fit into your bullet list above?
Scatter is a chart. It is the chart equivalent to Points (which then belongs to the concept we are discussing).
Almost any name is better than the ones I suggested but I don't think Locations is quite right. I think that Regions might be slightly closer but that isn't much better.
How about Markers? I think the idea of a marker is more specific than an annotation (e.g. I wouldn't consider text to be a marker). I would suggest ValueMarker except there doesn't have to be a value associated. Maybe DataMarker suggests there can be important data associated with the elements but I do realize that in general a prefix of 'data' is pretty useless!
That said, I do quite like the sound of DataMarker: even if the prefix is useless, it makes it clear that we are trying to refer to a specific concept...
Just to summarize, we will have the regular elements (charts, rasters, chart3d, tables etc.), then annotations, of which the Path elements are a subclass. Then the idea of DataMarker is intermediate between annotations and the regular elements.
"Regions" implies a 2D enclosed area, to me. GraphicalElement? Marker seems ok. DataMarker is ok too, though as you say Data means nearly nothing.
Is this something we still want to consider looking at? From my perspective, elements can be conceptually grouped as follows:
Defined as plot where independent variables map to x-axis and dependent variables to y-axis.
Binned data in 1D or 2D.
2D Gridded data where each coordinate maps to pixel center
Network graphs showing connectivity between different nodes.
Kernel density estimates of 1D and 2D data
Pure annotations useful for highlighting some aspect of a plot.
2D locations where x- and y-axis represent the same quantity/space
Undefined concept == Spatially situated, i.e. treating the 2D plot as a 2D space. SpatiallySituated is not a good name, though.
I'm not sure what your criterion is for separating the Annotations from this category; how are they different? Text, HLine, VLine, and Arrow are all situated in the 2D space of the plot, aren't they?
Also, are the x and y axes really required to be commensurate? Seems like Points, Bounds, and Box at least are well defined whether or not x has the same scale as y, making them a special subset of this concept.
I agree with all that; my criterion for separating annotations was simply that Contours/Polygons/Points/Path are generally not merely annotations but actual data.
> Also, are the x and y axes really required to be commensurate?
Not really, that's just how I think about these elements. I think generally when they are not commensurate you're probably using the element as an annotation rather than to represent actual data (but of course there will be exceptions).
> I think generally when they are not commensurate you're probably using the element as an annotation rather than to represent actual data (but of course there will be exceptions).
That's probably true of Bounds and Box, but Points seems useful for representing locations in any 2D space. E.g. a plot of market capitalization vs. number of employees, with color representing something else, e.g. which stock exchange the company is traded on --- seems very much a Points plot rather than Scatter (as market capitalization isn't a function of number of employees, or vice versa). But also not an annotation, just a plot in a 2D space where x and y aren't commensurate.
> seems very much a Points plot rather than Scatter (as market capitalization isn't a function of number of employees, or vice versa)
I'd argue you should use Scatter for this. You're never sure if one is a variable of the other but Scatter lets you ask the question "do the numbers of employees have a relationship to the market cap or vice versa?".
A matter of preference, I guess; if I'm trying to see the pattern of colored dots, I don't want to have to pick one or the other dimension as being nominally the independent one; they are both independent to me...
I think the crux of it is figuring out cause and effect - you often don't know about this relationship and two quantities may have nothing to do with each other. This makes it hard to know which should be the kdim and which should be the vdim when considering two dimensions of different types.
When two dimensions have the same type, you can have rotations in that space as it is 'uniform', i.e. you get to choose your basis. In this case, it makes sense for there to be two kdims.
Thinking about it this way, the ideal case would be that you use Points for two dimensions of the same type (where you can rotate the basis) and you use Scatter otherwise, as you know which dimension is the kdim and which is the vdim.
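The "choose your basis" argument can be made concrete with a small sketch. It uses only the Python standard library rather than HoloViews, and the helper names `rotate` and `pairwise_distances` are hypothetical. Assuming both dimensions share a type, rotating the frame leaves the distances between points unchanged, so neither axis is privileged:

```python
import math

def rotate(points, angle):
    """Rotate (x, y) pairs about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def pairwise_distances(points):
    """Euclidean distance between every pair of points, in a fixed order."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

# Hypothetical positions where x and y are the same kind of quantity (e.g. metres).
positions = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.0)]
rotated = rotate(positions, math.radians(30))

before = pairwise_distances(positions)
after = pairwise_distances(rotated)

# The spatial structure survives the change of basis.
same_structure = all(math.isclose(a, b) for a, b in zip(before, after))
```

If x were number of employees and y were market capitalization, the same rotation would mix units and the resulting distances would have no obvious meaning, which is the situation where this comment suggests Scatter.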
The problem then is deciding on this relation, which is not obvious for uncorrelated, independent quantities. The question then is why you would want to plot scatters for uncorrelated quantities. I suppose you might simply be searching for correlations...
So on balance I think I do agree with Philipp's assessment.
> I don't want to have to pick one or the other dimension as being nominally the independent one...
Then the way to think about it is that by picking a kdim and a vdim for your Scatter, you are making a hypothesis about a relationship between two quantities. That hypothesis, that there is a meaningful relationship to visualize, may be false of course...
If I'm mainly interested in what I'm plotting as color or size, I'm not necessarily even interested in making a hypothesis about how the x and y dimensions relate to each other; I'm interested in how they relate to the z dimension(s). By using Points for such a plot I want to indicate that I am making no such hypothesis.
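The two positions can be compared side by side with a minimal sketch. It does not use HoloViews itself: `ElementSpec` and its `hypothesis` method are hypothetical stand-ins that only record which dimensions are declared as kdims and which as vdims, using the market capitalization example from earlier in the thread:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ElementSpec:
    """Hypothetical stand-in for an element's dimension declaration."""
    kind: str
    kdims: tuple  # key dimensions: treated as independent
    vdims: tuple  # value dimensions: treated as dependent on the kdims

    def hypothesis(self):
        """Describe the relationship the declaration implies between x and y."""
        if len(self.kdims) == 1 and self.vdims:
            return f"{self.vdims[0]} depends on {self.kdims[0]}"
        return "no relationship assumed between " + " and ".join(self.kdims)

# Scatter-style: one kdim on x, the first vdim on y.
scatter = ElementSpec("Scatter", kdims=("employees",), vdims=("market_cap", "exchange"))
# Points-style: both axes are kdims; the extra information goes in the vdims.
points = ElementSpec("Points", kdims=("employees", "market_cap"), vdims=("exchange",))

print(scatter.hypothesis())  # market_cap depends on employees
print(points.hypothesis())   # no relationship assumed between employees and market_cap
```

In both declarations the exchange is still available for color; only the claim about how x and y relate changes.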
I agree, that is exactly where Points is useful because the x and y dimensions are interchangeable, i.e. you can choose any reference frame. For instance, an image of the night sky has no 'correct' reference frame: if you rotate the image by an arbitrary angle, you are still looking at the same thing, as the chosen reference frame is just convention.
I suspect you might be thinking of the case when x and y are different dimension types, in which case I think Scatter is more appropriate than Points: Scatter is for plotting one thing against the other.
The concept and baseclass they share is now called Geometry.
Explaining the difference between Points and Scatter has always been tricky. Partially, this is because Scatter is in chart.py when it is conceptually more similar to Contours in path.py. I think both these classes are part of some difficult-to-name concept and are therefore both in the wrong file with the wrong base class.
Essentially, here is how I think of it:
Charts and Rasters: Wrappers around dimensioned data.
Annotations: Supplementary information designed to be overlaid on top of axes of any dimension.
Paths: A type of annotation that involves drawing variously shaped lines on top of your plot.
ConceptX: The thing that Points, Contours and Polygons are. Essentially, these are halfway between annotations and data. Without value dimensions they are annotations, but with value dimensions they are a sort of annotated data. I can't come up with a name!
The point of elements in ConceptX is that they convey information by their visual appearance (the positions of the points, contours or polygons) but may also have some value associated with each of these visual elements (e.g. the iso-levels of contours). The best (terrible!) semantic name I can come up with is OptionallyValuedVisualElements.
This links to issue #102 discussing potential improvements to Contours.
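As a rough illustration of the ConceptX idea, here is a minimal sketch in plain Python. The class `ConceptXElement` and its `role` property are hypothetical and are not part of the library; they only model the claim that the same geometry acts as an annotation without values and as annotated data with them:

```python
from dataclasses import dataclass, field

@dataclass
class ConceptXElement:
    """Hypothetical model of a Points/Contours/Polygons-like element."""
    positions: list                             # (x, y) pairs that are drawn
    values: dict = field(default_factory=dict)  # optional value dimensions, by name

    @property
    def role(self):
        # Without values, the shape alone carries the information.
        return "annotated data" if self.values else "annotation"

outline = ConceptXElement(positions=[(0, 0), (1, 0), (1, 1)])
contour = ConceptXElement(positions=[(0, 0), (1, 0), (1, 1)], values={"level": 0.5})

print(outline.role)  # annotation
print(contour.role)  # annotated data
```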