Matthew-Weber / ReutersCharter


Data transformations silently fail on categorical scales #1

Closed basilesimon closed 5 years ago

basilesimon commented 5 years ago

chartblock.js allows data transformations to be passed to categorical scales. This is confusing because it fails silently: the only x1, y1 coordinates it produces are NaN.

percentChange is calculated anyway, so I'm not sure where we could try something.

A potential first step would be to catch the problem and at least console.log() something helpful for the user?
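
A minimal sketch of such a guard, not the actual chartblock.js code. The option names `xScaleType` and `dataType`, the value `"value"` for untransformed data, and the function `resolveDataType` are all hypothetical stand-ins for whatever the charter uses internally. Falling back to the raw values is only one possible choice here.

```javascript
// Hypothetical guard: the names below are illustrative, not taken from chartblock.js.
function resolveDataType(options, log = console.log) {
  const isCategorical = options.xScaleType === "category";
  const wantsTransform = options.dataType !== "value";

  if (isCategorical && wantsTransform) {
    // Tell the user why the chart will not show the transformation they asked for.
    log("Data transformations are not available for category charts; charting original values.");
    return "value";
  }
  return options.dataType;
}
```

Called with `{ xScaleType: "category", dataType: "percentChange" }`, this logs the message once and returns `"value"`; for a time or linear scale it returns the requested transformation unchanged.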

Matthew-Weber commented 5 years ago

I have updated it to console.log() DATA TRANSFORMATIONS ARE NOT AVAILABLE FOR CATEGORY CHARTS. when the user attempts to do this. Charter will now draw an UNTRANSFORMED chart (original values). Is this OK? Other options are to:

  1. Still have it fail.
  2. Allow data transformations on categories. I had initially resisted this because it seems weird to me: calculating percent changes between categories seems unintuitive, as they are not sequential sets but discrete items. But maybe I'm wrong here. What does anyone else think?

basilesimon commented 5 years ago

That sounds great @Matthew-Weber.

So, about the use case we talked about earlier – from a csv like this one:

category,tomato,banana
pre-crisis,1,3
current,10,4

I imagined the LineChart would be able to cope. If not, could a workaround be to change the category to, say, dates?

date,tomato,banana
2016-01-01,1,3
2019-01-01,10,4

then re-assign in the config options much like I would do for the columns

multiDataColumns: {"tomato": "Tomatoes", "banana": "Bananas"},

Would that be correct?

Matthew-Weber commented 5 years ago

Yes! It will absolutely work with dates. Two things to note here, though.

  1. The date format there is not what the charter will expect, so you will need to tell it the format of your dates explicitly. dateParse: d3.timeParse("%Y-%m-%d"),

  2. multiDataColumns is not what you want. That option activates buttons that switch between different data sets. What you do want is columnNames:

columnNames: {"tomato": "Tomatoes", "banana": "Bananas"},

Matthew-Weber commented 5 years ago

Again though, you make a compelling case with your data set for why one would theoretically want to do a percentage change between categories (from pre-crisis to post-crisis). When I was building this, I was thinking of categories like "Dogs" and "Cats" and had figured, why would you want to do a percentage change between those? So this still raises the question: should I allow that behavior? Or does it open the door to people doing transformations on data that should NOT be transformed?
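
For concreteness, here is a rough sketch of what a percentage change between the two category rows of the example CSV above would compute. It assumes "percent change" means the change relative to the first row; the charter's own percentChange calculation may differ, and `percentChangeFromFirst` is a hypothetical helper, not part of the library.

```javascript
// Rows from the example CSV earlier in this thread.
const rows = [
  { category: "pre-crisis", tomato: 1, banana: 3 },
  { category: "current", tomato: 10, banana: 4 },
];

// Percent change of each series relative to its value in the first row.
// Assumption: this baseline is illustrative; the charter may use another.
function percentChangeFromFirst(rows, columns) {
  const base = rows[0];
  return rows.map((row) => {
    const out = { category: row.category };
    for (const col of columns) {
      out[col] = ((row[col] - base[col]) / base[col]) * 100;
    }
    return out;
  });
}

const changes = percentChangeFromFirst(rows, ["tomato", "banana"]);
```

For these rows, tomato goes from 1 to 10 (a 900% increase) and banana from 3 to 4 (roughly 33%). The arithmetic is well defined here only because the rows have an implied order, which is not true of categories like "Dogs" and "Cats".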

basilesimon commented 5 years ago

I think my use of strings to describe dates (pre-crisis, current price) is confusing us here.

I now get why you restricted percentChange and data transformations in general to linear or time scales. I wonder if the feature we're actually after wouldn't be the ability to name rows, e.g.:

[image]

Matthew-Weber commented 5 years ago

Yes. There is functionality to do this, but perhaps it is too clunky or unintuitive.

xTickFormat (d,i,nodes) {return d},

in the block is a function passed into the tick generator. d is the value of the tick, i is the index of the tick, and nodes is the array of all the text items in the DOM.

You could write a bespoke function that tests the tick value or its index in the array and returns bespoke text.

if (i == 0){return "pre"}
if (i == nodes.length - 1){return "current price"}
return ""
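
Put together as a complete function, the snippet above might look like the sketch below. It assumes the charter calls `xTickFormat` with the same `(d, i, nodes)` arguments described above, and it labels only the first and last ticks.

```javascript
// Sketch of a custom tick formatter: label only the first and last ticks.
// d is the tick value, i its index, nodes the array of tick text elements.
function xTickFormat(d, i, nodes) {
  if (i === 0) {
    return "pre";
  }
  if (i === nodes.length - 1) {
    return "current price";
  }
  // Every tick in between gets an empty label.
  return "";
}
```

The tick value `d` is ignored here, so the labels depend purely on position; testing on `d` instead would tie the labels to specific dates.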

But maybe there is a better, more intuitive solution for this scenario. Thoughts?

basilesimon commented 5 years ago

Even if we introduced more control over transformations (and I'm not saying we should), eg

transformations: {
  type: 'percentChange',
  fromColumn: 'pre-crisis',
  toColumn: 'current'
}

We would still need to catch/prevent the issue of transformations being disallowed on categorical scales. So as far as this issue is concerned we're pretty much done here, I should think :)