dc-js / dc.js

Multi-Dimensional charting built to work natively with crossfilter rendered with d3.js
Apache License 2.0

array filter feature in dc.js, how to handle when the .dimensions function takes multiple dimensions? #1866

Open lipingyang-geoai opened 2 years ago

lipingyang-geoai commented 2 years ago

Hello dc.js community, @gordonwoodhull, I have a question about the array filter (https://github.com/crossfilter/crossfilter/wiki/API-Reference#dimension_with_arrays). It is pretty straightforward to use when the value function in "crossfilter.dimension(value [, isArray])" returns one variable/column, but how should I handle the array filter when the value is an array of dimensions?

For example, this is what we normally use for the array filter feature:

    var myDim_singleColumn = crossfilter.dimension(function(d) {
        return d["oneColumn_in_csv"];
    }, true);

When not using the array filter feature, the following works:

    var myDim_MultipleColumn = crossfilter.dimension(function(d) {
        return d["oneColumn_in_csv", "AnotherColumn_in_csv"];
    });

But when I try to use the array filter feature, it throws an error:

    var myDim_MultipleColumn = crossfilter.dimension(function(d) {
        return d["oneColumn_in_csv", "AnotherColumn_in_csv"];
    }, true);

Any suggestions? Thank you!

gordonwoodhull commented 2 years ago

JavaScript arrays are one-dimensional, so a comma inside square brackets is just the comma operator: the expression evaluates to its last operand, and you always get the element at the last index listed.

e.g.

var a = [1,2,3]
a[0,1] // returns 2
a[4,3,7,2] // returns 3
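
The same rule applies to property access on a data row, which is why the multi-column accessor in the question only ever reads one column. A minimal sketch, assuming a hypothetical row object that uses the two column names from the question; returning an array literal is one way to get both values out:

```javascript
// Hypothetical row; the column names are taken from the question above.
const row = { oneColumn_in_csv: "x", AnotherColumn_in_csv: "y" };

// The comma operator evaluates both operands and yields the last one,
// so this is equivalent to row["AnotherColumn_in_csv"].
const lastOnly = row["oneColumn_in_csv", "AnotherColumn_in_csv"];

// An array literal keeps both values.
const both = [row["oneColumn_in_csv"], row["AnotherColumn_in_csv"]];

console.log(lastOnly); // "y"
console.log(both);     // [ 'x', 'y' ]
```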

I think your underlying question is how to use a tag/array dimension with the sunburst chart. I will try to answer soon!

lipingyang-geoai commented 2 years ago

@gordonwoodhull, Yep, Gordon, you know crossfilter and dc.js so well! I look forward to your suggestions and guidance!

gordonwoodhull commented 2 years ago

Hi @lipingyang-geoai!

I got an example working. It is rather artificial because I don't know what your data set looks like, but it uses generated data that looks like this:

[
  {"paths":[["b","c","a"],["d","d","c"],["a","b","b"],["d","d","c"]],"size":345,"other":"p"},
  {"paths":[["d","a","a"],["c","d","d"],["d","d","d"]],"size":1,"other":"p"},
  {"paths":[["a","d","a"],["d","b","a"],["d","c","d"],["b","a","c"]],"size":77,"other":"n"}
]

I found two ways that seem to work. Here is a jsFiddle demo:

https://jsfiddle.net/gordonwoodhull/0d1tumy3/90/

Option A: Return array of arrays from tag dimension.

Option B: Collapse (join) the path into a string, and then split it using a fake group.

Option A is simpler, but I am not 100% sure if it is correct, because it involves some implicit conversion of arrays to strings.
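
To illustrate the implicit conversion mentioned for Option A: when JavaScript coerces an array to a string (for example, when the array is used as an object key), the elements are joined with commas. The sketch below is plain JavaScript, not crossfilter code, and whether crossfilter relies on exactly this coercion internally is an assumption here:

```javascript
const path = ["b", "c", "a"];

// String coercion joins the elements with commas.
const asString = String(path);

// Using an array as an object key triggers the same coercion,
// so two distinct arrays with equal contents map to the same key.
const counts = {};
counts[["b", "c", "a"]] = 1;
counts[path] += 1;

console.log(asString);            // "b,c,a"
console.log(Object.keys(counts)); // [ 'b,c,a' ]
console.log(counts["b,c,a"]);     // 2

// A caveat of the coercion: segments that contain commas become ambiguous.
const ambiguous = String(["a,b", "c"]) === String(["a", "b,c"]);
console.log(ambiguous);           // true
```

If path segments can contain commas, joining with an explicit separator as in Option B avoids that ambiguity, provided the separator never appears in a segment.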

Option B, collapsing the paths:

    var picturesDimension = ndx.dimension(function (d) {
        return d.paths.map(p => p.join('/'))
    }, true);

Splitting paths using a fake group:

    const splitKeys = group => ({
        all: () => group.all().map(({key, value}) => ({key: key.split('/'), value}))
    });
    // ...
    fileChart
        // ...
        .group(splitKeys(picturesGroup))
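
To see what the join and the fake group do to the keys, here is a minimal sketch in plain JavaScript. `countByTag` is a hypothetical stand-in for a crossfilter array-dimension group: it only mimics the `{key, value}` shape returned by `group.all()` and is not crossfilter's implementation. The two records are taken from the sample data above:

```javascript
const data = [
  {paths: [["d","a","a"],["c","d","d"],["d","d","d"]], size: 1, other: "p"},
  {paths: [["a","d","a"],["d","b","a"],["d","c","d"],["b","a","c"]], size: 77, other: "n"}
];

// Same accessor as the tag dimension: one joined string per path.
const joinPaths = d => d.paths.map(p => p.join('/'));

// Hypothetical stand-in for ndx.dimension(joinPaths, true).group():
// counts each record once per tag and returns entries sorted by key.
const countByTag = (records, accessor) => {
  const counts = new Map();
  for (const r of records) {
    for (const tag of accessor(r)) {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  const entries = [...counts].map(([key, value]) => ({key, value}));
  entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return { all: () => entries };
};

// The fake group: split the joined keys back into arrays.
const splitKeys = group => ({
  all: () => group.all().map(({key, value}) => ({key: key.split('/'), value}))
});

const picturesGroup = countByTag(data, joinPaths);
const joinedFirst = picturesGroup.all()[0];
const splitFirst = splitKeys(picturesGroup).all()[0];

console.log(joinedFirst); // { key: 'a/d/a', value: 1 }
console.log(splitFirst);  // { key: [ 'a', 'd', 'a' ], value: 1 }
```

Wrapping the group rather than changing the dimension keeps the string keys for filtering while handing array keys to the chart.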

Note that the workaround we use for #1864 makes the colors really bland and repetitive, because it's only using the last part of the path. For this kind of data where the paths are composed of only 'a'-'d', there end up being only four colors used. So there is probably a better fix for that issue.
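
A rough way to see why so few colors appear: if the color key is derived from only the last segment of each path (an assumption about what the workaround does, based on the description above), then paths over 'a'-'d' can produce at most four distinct keys, while the full paths can produce many more. A small sketch with hypothetical key functions:

```javascript
const letters = ["a", "b", "c", "d"];

// All 64 three-segment paths over 'a'-'d'.
const paths = [];
for (const x of letters) {
  for (const y of letters) {
    for (const z of letters) {
      paths.push([x, y, z]);
    }
  }
}

// Hypothetical color keys: last segment only vs. the whole joined path.
const lastSegmentKeys = new Set(paths.map(p => p[p.length - 1]));
const fullPathKeys = new Set(paths.map(p => p.join('/')));

console.log(lastSegmentKeys.size); // 4
console.log(fullPathKeys.size);    // 64
```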

Apologies in advance if using synthetic / generated data in the above example obscures what is going on. I guess the example shows that these features can work together, but it will still be some work to figure out what is going wrong with your data or code.