bianchi-dy closed this issue 5 years ago
The parallel coordinate plot is a really confusing plot... I will first try to explain why I think it is so confusing and also what I think is the best solution to support it.
There are three main ways to map data to aesthetics. In the examples below I will use numbers to refer to row indices, and letters to refer to column indices.
1. A single row maps to a single set of aesthetic values:
{ columnA: dataValueA, columnB: dataValueB, columnC: dataValueC }
->
{ aestheticA: aestheticValueA, aestheticB: aestheticValueB, aestheticC: aestheticValueC }
Example:
<vgg-point :x="row.a" :y="row.b" />
2. Whole columns map to whole arrays of aesthetic values:
{
  columnA: [ dataValueA1, dataValueA2, dataValueA3 ],
  columnB: [ dataValueB1, dataValueB2, dataValueB3 ]
}
->
{
  aestheticA: [ aestheticValueA1, aestheticValueA2, aestheticValueA3 ],
  aestheticB: [ aestheticValueB1, aestheticValueB2, aestheticValueB3 ]
}
Example:
<vgg-multi-line
  :x="dataframe.a"
  :y="dataframe.b"
/>
3. And then, the category that the parallel coordinate plot falls into, where a single row maps to arrays of aesthetic values:
{ columnA: dataValueA, columnB: dataValueB, columnC: dataValueC }
->
{
  aestheticA: ['columnNameA', 'columnNameB', 'columnNameC'],
  aestheticB: [dataValueA, dataValueB, dataValueC]
}
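The third mapping amounts to pivoting one row into two parallel arrays. A minimal sketch in plain JavaScript, assuming a hypothetical helper named pivotRow (it is not part of vue-gg and only illustrates the shape of the data):

```javascript
// Hypothetical helper (not part of vue-gg): pivot one data row into the
// two parallel arrays that a single line in a parallel coordinate plot needs.
function pivotRow (row, columns) {
  return {
    x: columns.slice(),                   // column names, one per axis
    y: columns.map(column => row[column]) // the row's value on each axis
  }
}

const row = { a: 1, b: 'apple', c: 5 }
const aesthetics = pivotRow(row, ['a', 'b', 'c'])
// aesthetics.x holds the column names, aesthetics.y the matching row values
```

Note that the y array mixes data types (numbers and strings), which is why a single scale cannot handle it.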
So what do we do with this? I was initially thinking of adding a new transformation called map, which would be like mutate in the sense that it would calculate a new column. So then you could do something like
<vgg-data
  :data="{ a: [1, 2, 3, 4], b: ['apple', 'apple', 'banana', 'banana'], c: [5, 6, 7, 8] }"
  :transform="{ map: {
    aScaled: { val: row => row.a, scale: { domain: 'a', range: [0, 2] } },
    bScaled: { val: row => row.b, scale: { domain: 'b', range: [0, 2] } },
    cScaled: { val: row => row.c, scale: { domain: 'c', range: [0, 2] } }
  } }"
>
  <vgg-section
    ...
    :scale-x="['a', 'b', 'c']"
    :scale-y="[0, 2]"
  >
    <vgg-map v-slot="{ row }">
      <vgg-multi-line
        :x="['a', 'b', 'c']"
        :y="[row.aScaled, row.bScaled, row.cScaled]"
      />
    </vgg-map>
  </vgg-section>
</vgg-data>
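To make the idea concrete, here is a rough sketch of what such a map transformation could compute. Everything in it is an assumption for illustration: the function names buildScale and mapTransform are hypothetical, numeric columns are assumed to get a linear scale over their min and max, and categorical columns are assumed to be spread evenly over the range. The library may well choose differently.

```javascript
// Build a scaling function for one column. Numeric columns get a linear
// scale; anything else gets evenly spaced points (both are assumptions).
function buildScale (values, [lo, hi]) {
  if (values.every(v => typeof v === 'number')) {
    const min = Math.min(...values)
    const max = Math.max(...values)
    const span = max - min || 1 // avoid dividing by zero for constant columns
    return v => lo + ((v - min) / span) * (hi - lo)
  }
  const categories = [...new Set(values)]
  const step = categories.length > 1 ? (hi - lo) / (categories.length - 1) : 0
  return v => lo + categories.indexOf(v) * step
}

// Hypothetical `map` transformation over a column-oriented data object:
// every spec entry adds one new, already scaled column.
function mapTransform (data, spec) {
  const length = Object.values(data)[0].length
  const rows = Array.from({ length }, (_, i) =>
    Object.fromEntries(Object.keys(data).map(key => [key, data[key][i]])))
  const result = { ...data }
  for (const [newColumn, { val, scale }] of Object.entries(spec)) {
    const scaleFn = buildScale(data[scale.domain], scale.range)
    result[newColumn] = rows.map(row => scaleFn(val(row)))
  }
  return result
}

const data = { a: [1, 2, 3, 4], b: ['apple', 'apple', 'banana', 'banana'], c: [5, 6, 7, 8] }
const transformed = mapTransform(data, {
  aScaled: { val: row => row.a, scale: { domain: 'a', range: [0, 2] } },
  bScaled: { val: row => row.b, scale: { domain: 'b', range: [0, 2] } }
})
```

With this sketch, every scaled column lands in the same [0, 2] range, so the section only needs one shared y scale.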
Although I still kind of think the map transformation is fine to add to the library at some point, I actually like your approach, with the array of scaling options, better. But there are two problems with it. The first is that the explanation above (point 3) shows why the following code
:y="{ val: row.explanatory, scale: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG'] }"
wouldn't be enough. Instead, it would have to be something like
<vgg-multi-line
  :x="{
    val: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG'],
    scale: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']
  }"
  :y="{
    val: [row.Name, row.Price, row.WetWeight, row.RearWheelHorsepower, row.TopSpeed, row.MilesPG],
    scale: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']
  }"
/>
The second problem is that right now, giving an array directly to scale is already used to manually specify a domain. So scale: ['Name', 'Price', ...] means that you are trying to create a single categorical scale (as we are doing in the :x prop!). But this problem could be solved by simply adding a new option called scales. So you would get
<vgg-multi-line
  :x="{
    val: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG'],
    scale: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']
  }"
  :y="{
    val: [row.Name, row.Price, row.WetWeight, row.RearWheelHorsepower, row.TopSpeed, row.MilesPG],
    scales: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']
  }"
/>
Which I think is pretty neat! The only possible objection is that you might not notice the difference between scale and scales if you read the code quickly. But that might not be a real issue, and otherwise we could also pick something other than scales.
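The difference between the two options can be sketched in a few lines. This is only an illustration, not how mappings.js works: resolveAesthetic and scaleLookup are hypothetical names, the scales are assumed to be looked up by column name, and the domains below are invented numbers.

```javascript
// Minimal linear scale: maps the domain [d0, d1] onto the range [r0, r1].
const linear = ([d0, d1], [r0, r1]) => v => r0 + ((v - d0) / (d1 - d0)) * (r1 - r0)

// Hypothetical resolver: `scale` applies one scale to every value,
// `scales` applies the i-th scale to the i-th value.
function resolveAesthetic ({ val, scale, scales }, scaleLookup) {
  if (scales !== undefined) {
    if (scales.length !== val.length) {
      throw new Error('scales must have one entry per value')
    }
    return val.map((v, i) => scaleLookup[scales[i]](v))
  }
  return val.map(v => scaleLookup[scale](v))
}

// Invented domains, purely for illustration.
const scaleLookup = {
  Price: linear([0, 20000], [0, 1]),
  TopSpeed: linear([0, 200], [0, 1])
}

// One scale per value: y[0] uses the Price scale, y[1] the TopSpeed scale.
const y = resolveAesthetic(
  { val: [5000, 150], scales: ['Price', 'TopSpeed'] },
  scaleLookup
)

// One scale for all values, which is the current behavior of `scale`.
const prices = resolveAesthetic({ val: [10000, 20000], scale: 'Price' }, scaleLookup)
```

Throwing on a length mismatch is a design choice in this sketch; silently truncating would hide exactly the kind of mistake that the similar option names make likely.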
Positioning the axes would be simple if we moved the scale: ['Name', 'Price' ...] inside of the :x prop out to the vgg-section's :scale-x prop. Then you could position the axes with
<vgg-section
  ...
  :scale-x="['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']"
>
  <vgg-map v-slot="{ row }">
    ...
  </vgg-map>
  <vgg-x-axis
    v-for="column in ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']"
    :x="column"
    :w="50"
    ...
  />
</vgg-section>
Doing that inline might be a little harder, and I am not immediately sure how we would do that. But this would work for now, right? It shouldn't be too much work to implement either, just adding some logic to the mappings.js file I think. Thoughts?
@luucvanderzee I'm fine with the approach for positioning the y-axis along the x-axis using scale-x in vgg-section, since then it goes hand in hand with the library's general positioning and scaling logic, but I'm a little unclear on how the inputs to x and y get processed in the mark itself, e.g.
<vgg-multi-line
  :x="{
    val: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG'],
    scale: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']
  }"
  :y="{
    val: [row.Name, row.Price, row.WetWeight, row.RearWheelHorsepower, row.TopSpeed, row.MilesPG],
    scales: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']
  }"
/>
So if we use the following in vgg-section:
:scale-x="['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']"
Then I think we'd no longer need scale: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG'] in this bit:
:x="{
  val: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG'],
  scale: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']
}"
Since it already scales to scale-x. Is that correct?
As for
:y="{
  val: [row.Name, row.Price, row.WetWeight, row.RearWheelHorsepower, row.TopSpeed, row.MilesPG],
  scales: ['Name', 'Price', 'WetWeight', 'RearWheelHorsepower', 'TopSpeed', 'MilesPG']
}"
then I suppose for the enclosing vgg-section, there would be no scale-y. We can change the object key name to something like scaleOrder to make it more distinct, etc. How difficult do you think this would be to implement?
@bianchi-dy
About your first point: yes, that is correct! You could decide whether you want to use the Section's :scale-x prop, or the inline version with { val: ... , scale: ... }. This is already supported behavior btw.
About the second point: again correct, the Section's :scale-y prop could not be used for this. As for the name: scaleOrder is already better than scales, but idk... maybe we can still brainstorm a bit about it. I don't expect this to be too hard to implement tbh. Do you need this feature urgently?
This has been somewhat resolved in the scale-transformation branch, but I'll run some tests for other data types to see if we've covered everything.
Resolved in #136
Currently, scales are applied to marks on a 1-to-1 basis for x-y coordinates: only one scale is applied to the entirety of, for example, row.explanatory.
However, for charts such as parallel coordinates and radar charts, each axis has its own scale, and thus each x or y coordinate needs to be scaled according to the dimension of its axis, such that a different scale applies to each point in the row. So a row of data might look like:
['apple', 100, 2, 'b', 500]
which would then be scaled to coordinates internally (say, given a range of [0, 10]):
[1, 5, 2, 3, 6]
From what I can understand in this example, Vega applies multiple scales to a single mark by specifying the scales as part of an array; it then seems a given point's y in a data row is scaled according to the scale matching its index in the scales array. One caveat is that they only seem to support scaling for continuous domains at the moment. On the other hand, Vega-Lite requires the data to be transformed (window transform + fold transform appear to be the key transformations here).
There's an attempt at Parallel Coordinates living in the idcGraphs branch. Based on how much finagling it took to implement manual scaling per axis, giving the necessary scales as an array and then enumerating the scales to the points seems a reasonable way to approach it in vue-gg, perhaps.
The scale for the x-coordinate is related to y-axis positioning, so this is scaled to a range of [0, 1], such that the x coordinate matches where the axis with the relevant given dimension is. So for the same sample above:
x: [ <axis 1 loc>, <axis 2 loc>, <axis 3 loc>, <axis 4 loc>, <axis 5 loc> ] scales to [0, 0.2, 0.4, 0.6, 0.8]; this is scaled to range [0, 1], as that is the input for hjust in vgg-x-axis
y: ['apple', 100, 2, 'b', 500] could scale to [1, 5, 2, 3, 6] given a range of [0, 10]; this is one line in the chart. Each item in [1, 5, 2, 3, 6] refers to the y-coordinate of the line at a given axis.
Implementation-wise, I'm not sure if I'm missing any pros and cons with an enumeration approach or if there are better ways to carry out multiple scales (I'm also still studying Vega's implementation so I'll update this issue if anything interesting comes up). Any thoughts?
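The x positions in the sample above follow from spacing the axes evenly over [0, 1]. A small sketch, assuming a hypothetical helper named axisLocations and placeholder dimension names (neither comes from vue-gg); the y values are the already scaled sample values from above:

```javascript
// Hypothetical helper: evenly spaced axis locations in [0, 1),
// one per dimension, starting at 0.
function axisLocations (dimensions) {
  return dimensions.map((_, i) => i / dimensions.length)
}

// Placeholder dimension names for the five axes in the sample.
const locations = axisLocations(['dim1', 'dim2', 'dim3', 'dim4', 'dim5'])

// The sample row after per-axis scaling to the range [0, 10].
const scaledY = [1, 5, 2, 3, 6]

// One line in the chart: pair each axis location with the row's y on that axis.
const points = locations.map((x, i) => [x, scaledY[i]])
```

Dividing by the number of dimensions leaves the last axis at 0.8 rather than 1, which matches the sample; dividing by the number of dimensions minus one would instead stretch the axes across the full width.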