dc-js / dc.js

Multi-Dimensional charting built to work natively with crossfilter rendered with d3.js
Apache License 2.0

Should bubbleChart have a colorValueRetriever? #12

Closed syrincella closed 12 years ago

syrincella commented 12 years ago

Hi Nick,

I'm really enjoying playing with your dc.js package here. I've started right off on a bubbleChart, and I've run into this issue: I can define a color scale, but it's not clear to me how I can bind it to a dimension other than the one that is bound to the entire chart by the .dimension call.

I'm looking at your example, and I can't really glean from it (maybe I should just build it locally and look) what domain the color range varies across. I can't detect a transition by years, or by percent gain/loss, or by point gain/loss. All colors seem to fill bubbles clustered near [0,0] on the graph.

I have four dimensions I want to display. X & Y obviously, then another that drives the size of the bubbles, and then a fourth that is a kind of heatmap on the bubbles. So I want to define a hot-to-cold color range (doable) and then apply it to a specific dimension somehow. Ways I could think of would either be to add some .colorDimension call or to define a .colorValueRetriever on the bubbleChart function and then stash associated color with a point during the reduceAdd.
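A minimal sketch of the second idea, written as plain JavaScript so it runs without crossfilter or dc.js. Everything here is an assumption made for illustration: the record fields (`cluster`, `size`, `heat`), the `groupAll` stand-in for `group().all()`, and `colorValueRetriever` itself, which is only the name proposed above and not an existing dc.js call.

```javascript
// Hypothetical records: an x category, a size, and a "heat" attribute for color.
var records = [
    {cluster: "a", size: 10, heat: 0.25},
    {cluster: "a", size: 30, heat: 0.75},
    {cluster: "b", size: 5,  heat: 1.0}
];

// Reduce functions in the crossfilter style: (p, v) -> p.
function reduceAdd(p, v) {
    ++p.count;
    p.size += v.size;
    p.heatSum += v.heat;
    p.heat = p.heatSum / p.count;   // value stashed for the color channel
    return p;
}
function reduceRemove(p, v) {
    --p.count;
    p.size -= v.size;
    p.heatSum -= v.heat;
    p.heat = p.count ? p.heatSum / p.count : 0;
    return p;
}
function reduceInitial() {
    return {count: 0, size: 0, heatSum: 0, heat: 0};
}

// Hypothetical retriever, named after the proposal; not a dc.js call.
function colorValueRetriever(p) {
    return p.value.heat;
}

// Stand-in for group().all(): fold records into {key, value} pairs by key.
function groupAll(rows) {
    var byKey = {}, keys = [];
    rows.forEach(function (v) {
        if (!byKey.hasOwnProperty(v.cluster)) {
            byKey[v.cluster] = reduceInitial();
            keys.push(v.cluster);
        }
        reduceAdd(byKey[v.cluster], v);
    });
    return keys.sort().map(function (k) {
        return {key: k, value: byKey[k]};
    });
}

var groups = groupAll(records);
var heatValues = groups.map(colorValueRetriever);
```

The point of the sketch is that the color value travels inside the same reduced object as the x/y/radius values, so no second dimension is needed; a color scale would then only have to be applied to `heatValues`.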

I was thinking of exploring this idea in my own fork, but I wanted to run the idea past you as I'm very fresh to your code base (and to d3 (and to crossfilter (and to javascript!))) and could easily be missing something. If you think I am misguided, I'd love a pointer. If you think I'm on to something, then I'd love to hear that too and maybe you'll see a pull request soon.

thanks - Doug

NickQiZhu commented 12 years ago

Hi Doug,

I am glad you found dc.js useful. I created this library mainly based on my own project requirements, so some of the charts might not offer exactly what you expect. Currently the bubble chart only works on one dimension with a multi-value group/reduce. It allows you to retrieve three different values for x/y/radius respectively. It does not yet allow you to bind your data to the coloring of the bubbles (which is an interesting idea). You can take a look at the source code of my Nasdaq example page to see how a multi-value group/reduce in crossfilter can be used to render a bubble chart.

Create crossfilter dimension and a multi-value group/reduce:

    var yearlyDimension = ndx.dimension(function(d) {
        return d3.time.year(d.dd);
    });
    var yearlyPerformanceGroup = yearlyDimension.group().reduce(
        // add: called when a record enters the group
        function(p, v) {
            ++p.count;
            p.absGain += +v.close - +v.open;
            p.fluctuation += Math.abs(+v.close - +v.open);
            p.sumIndex += (+v.open + +v.close) / 2;
            p.avgIndex = p.sumIndex / p.count;
            p.percentageGain = (p.absGain / p.avgIndex) * 100;
            p.fluctuationPercentage = (p.fluctuation / p.avgIndex) * 100;
            return p;
        },
        // remove: called when a filter takes a record out of the group
        function(p, v) {
            --p.count;
            p.absGain -= +v.close - +v.open;
            p.fluctuation -= Math.abs(+v.close - +v.open);
            p.sumIndex -= (+v.open + +v.close) / 2;
            // guard: an emptied group would otherwise yield 0 / 0 = NaN
            p.avgIndex = p.count ? p.sumIndex / p.count : 0;
            p.percentageGain = p.avgIndex ? (p.absGain / p.avgIndex) * 100 : 0;
            p.fluctuationPercentage = p.avgIndex ? (p.fluctuation / p.avgIndex) * 100 : 0;
            return p;
        },
        // init: starting value for every group
        function() {
            return {count: 0, absGain: 0, fluctuation: 0, fluctuationPercentage: 0, sumIndex: 0, avgIndex: 0, percentageGain: 0};
        }
    );
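One detail worth watching with averaging reducers like these: when filters remove every record from a group, `count` drops to 0 and `sumIndex / count` becomes 0 / 0, which is `NaN` in JavaScript. The sketch below isolates that case with a cut-down reducer pair. The `row` record and the names `removeUnguarded` / `removeGuarded` are assumptions for illustration only.

```javascript
// Averaging reducer pair in the same style as the example; only the
// fields needed to show the empty-group case are kept.
function add(p, v) {
    ++p.count;
    p.sumIndex += (+v.open + +v.close) / 2;
    p.avgIndex = p.sumIndex / p.count;
    return p;
}
function removeUnguarded(p, v) {
    --p.count;
    p.sumIndex -= (+v.open + +v.close) / 2;
    p.avgIndex = p.sumIndex / p.count;      // 0 / 0 when the group empties
    return p;
}
function removeGuarded(p, v) {
    --p.count;
    p.sumIndex -= (+v.open + +v.close) / 2;
    p.avgIndex = p.count ? p.sumIndex / p.count : 0;
    return p;
}
function init() {
    return {count: 0, sumIndex: 0, avgIndex: 0};
}

// Hypothetical record; values are strings, as they would be after CSV parsing.
var row = {open: "100", close: "110"};
var unguarded = removeUnguarded(add(init(), row), row);
var guarded = removeGuarded(add(init(), row), row);
```

Here `unguarded.avgIndex` ends up as `NaN` while `guarded.avgIndex` is 0. A `NaN` passed to an x, y or radius scale tends to produce invalid SVG attributes, so the guarded form is the safer choice whenever a group can become empty.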

Chart definition:

    yearlyBubbleChart.width(990)
        .height(250)
        .dimension(yearlyDimension)
        .group(yearlyPerformanceGroup)
        .colors(d3.scale.category20c())
        .keyRetriever(function(p) {
            return p.value.absGain;
        })
        .valueRetriever(function(p) {
            return p.value.percentageGain;
        })
        .radiusValueRetriever(function(p) {
            return p.value.fluctuationPercentage;
        })
        .x(d3.scale.linear().domain([-2500, 2500]))
        .y(d3.scale.linear().domain([-100, 100]))
        .r(d3.scale.linear().domain([0, 4000]))
        .renderLabel(true)
        .renderTitle(true)
        .label(function(p) {
            return p.key.getFullYear();
        })
        .title(function(p) {
            return p.key.getFullYear()
                + "\n"
                + "Index Gain: " + numberFormat(p.value.absGain) + "\n"
                + "Index Gain in Percentage: " + numberFormat(p.value.percentageGain) + "%\n"
                + "Fluctuation / Index Ratio: " + numberFormat(p.value.fluctuationPercentage) + "%";
        })
        .yAxis().tickFormat(function(v) {
            return v + "%";
        });

Also check out crossfilter's API for more details: https://github.com/square/crossfilter/wiki/API-Reference

A side note: typically a bubble chart lets you encode three different pieces of information using x/y/radius; adding a fourth dimension, in my personal opinion, might be a little too tricky for your users to decipher. In that case I would probably just create a separate chart (a bar chart or line chart) to highlight this data distribution (multi-dimensional representation is what dc.js is designed for). I am also currently working on a concept called renderlet: a sort of hook/callback that you can register on any chart to inject your own custom logic and render anything you want using the raw d3 API. This capability will be ready in the next release (v0.7). Once that is in place, you should be able to set the color of each bubble based on your own calculation.
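To make the hook idea concrete, here is a rough sketch of how a post-render callback could work. It is not the dc.js implementation: `makeChart`, the `nodes` array, and the shape of the `renderlet` method are all hypothetical, and the "rendering" only builds plain objects instead of touching the DOM.

```javascript
// Hypothetical minimal "chart": holds data, renders, then runs the hooks.
function makeChart(data) {
    var renderlets = [];
    var chart = {
        nodes: [],
        // Register a callback that runs after every render.
        renderlet: function (fn) {
            renderlets.push(fn);
            return chart;
        },
        render: function () {
            // Default rendering: every bubble gets the same fill.
            chart.nodes = data.map(function (d) {
                return {datum: d, fill: "steelblue"};
            });
            // Custom logic gets the last word on the rendered nodes.
            renderlets.forEach(function (fn) {
                fn(chart);
            });
            return chart;
        }
    };
    return chart;
}

var chart = makeChart([
    {key: 2008, value: {absGain: -900}},
    {key: 2009, value: {absGain: 700}}
]).renderlet(function (c) {
    // Recolor each bubble from its own reduced value.
    c.nodes.forEach(function (n) {
        n.fill = n.datum.value.absGain < 0 ? "red" : "green";
    });
}).render();
```

Because the callback runs after the default rendering, it can override anything the chart drew, which is what makes per-bubble coloring possible without a dedicated color API.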

Cheers,

Nick

syrincella commented 12 years ago

Hi Nick,

Thanks for that response. In my case I'm holding the y-dimension constant and by my product owner's spec I am trying to map one dimension to color, another to size (radius) and another to a categorical (ordinal) scale along x - basically the result is to show a series of clusters discovered in data (x), their relative representation of the population (r) and (ideally) a cluster attribute that indicates the degree of "separation" (the important thing) between this cluster and the entire population. So still within your recommended max of three dimensions.

The hope was to represent this separation dimension with a color range, like a kind of heat map. In fact, mapping this to y would convey the same information, so there is a way out. But it does seem a little odd to allow a color range to be specified on your graph without the natural next step of mapping it to the variation within a dimension. So yeah, I +1 that :-)

In your example, it might be interesting to try to add a dimension that represented major industries and then show that as color. Blues for tech, yellows for commodities, reds for financial, etc. I agree that with your range of years, that could get cluttered pretty quickly, perhaps it would make more sense over just a couple of years. Anyway, a hypothetical.

I did read a lot of your code, including the example below, and in fact I attribute my deeper understanding of crossfilter to your examples. And being new to JavaScript, I found your code organization very illustrative of what appear to me to be some really clean idioms. Very helpful.

As long as I have this comment open, I'd also like to advocate allowing order to be defined on your group. I attempted this with no success, and when I read bubble-chart.js I saw why. Specifically:

    var bubbleG = _chart.g().selectAll("g." + NODE_CLASS)
        .data(_chart.group().all());

According to the API docs, this "Returns the array of all groups, in ascending natural order by key." If instead you did something like

    var bubbleG = _chart.g().selectAll("g." + NODE_CLASS)
        .data(_chart.group().top(Infinity));

You'd allow for grouping order to be added to the chart. This specifically would have helped me because while I wanted the X-axis to draw my ordinal names, I wanted to order by the relative size. Anyway the docs acknowledge that this is slower than .all(), but still, seems like a cost-benefit tradeoff to be made by the framework client.
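The difference between the two orderings can be sketched in plain JavaScript. The `allGroups` data and the `topBy` helper are stand-ins invented for this illustration; in crossfilter itself the ordering would come from `group.order(...)` combined with `group.top(Infinity)`.

```javascript
// Stand-in for group().all(): groups in ascending key order.
var allGroups = [
    {key: "alpha", value: {size: 12}},
    {key: "beta",  value: {size: 40}},
    {key: "gamma", value: {size: 25}}
];

// Stand-in for group().top(Infinity) with an ordering on value.size:
// a copy of the groups sorted from largest to smallest.
function topBy(groups, orderValue) {
    return groups.slice().sort(function (a, b) {
        return orderValue(b.value) - orderValue(a.value);
    });
}

var bySize = topBy(allGroups, function (v) {
    return v.size;
});

// The ordinal x domain then follows the size ranking, not the key order.
var xDomain = bySize.map(function (g) {
    return g.key;
});
```

With this data `xDomain` comes out as `["beta", "gamma", "alpha"]`, so the ordinal names still label the x-axis but appear in order of relative size.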

Hope this is helpful.

Doug


NickQiZhu commented 12 years ago

Hi Doug,

Thanks for the detailed explanation of your use case. I am always interested to know how people are using the library. The custom rendering hook being introduced in v0.7 release will allow you to do pretty much everything you want to do. And I am planning to introduce some new API to allow data visualization using color scaling. There is already another issue opened on sorting generally, and I am hoping to tackle it in 0.8 or 0.9 release.

Based on your description, the chart you are creating for your business is quite different from a traditional bubble chart, so the generic bubble chart implementation might not suit your needs, given the limited customization available through its API. Another option is to fork my implementation and customize it from the ground up, or just use it as a reference implementation to create your own. A heat map is also on my todo list (probably scheduled for the v0.8 release).

Cheers,

Nick

NickQiZhu commented 12 years ago

As of the 0.8 release, you can use color as an additional data dimension; see the example below:

    yearlyBubbleChart.colors(["red", "#ccc", "steelblue", "green"])
        .colorDomain([-1750, 1644])
        .colorAccessor(function(d) { return d.value.absGain; })
    ...
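For intuition about how a numeric domain and a list of colors can fit together, here is one plausible mapping, written as plain JavaScript. This is an assumption for illustration and is not confirmed to be what dc.js 0.8 does internally: the domain is split into equal-width bins, one per color, and out-of-range values are clamped. The `colorFor` helper and the sample gains are hypothetical.

```javascript
var colors = ["red", "#ccc", "steelblue", "green"];
var colorDomain = [-1750, 1644];

function colorAccessor(d) {
    return d.value.absGain;
}

// Assumed mapping: split the domain into equal-width bins, one per color,
// and clamp values that fall outside the domain.
function colorFor(d) {
    var min = colorDomain[0], max = colorDomain[1];
    var binWidth = (max - min) / colors.length;
    var index = Math.floor((colorAccessor(d) - min) / binWidth);
    index = Math.max(0, Math.min(colors.length - 1, index));
    return colors[index];
}

// Hypothetical gains covering both ends and the middle of the domain.
var sample = [-1750, -500, 0, 1644].map(function (gain) {
    return colorFor({value: {absGain: gain}});
});
```

Under this assumed scheme the four sample gains map to `red`, `#ccc`, `steelblue` and `green` in that order, which gives the hot-to-cold reading of the data that the original request was after.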
syrincella commented 12 years ago

Hey Nick,

I may check it out, but I took your suggestion to start coding directly to d3. I had already had an inkling that that's what I needed to do to properly learn that framework. Unfortunately the fine line to walk with these frameworks on top of frameworks is that the one on the top, while simplifying a very specific use case as you've pointed out, can hide a lot of the power you'd otherwise like to access. Like this color dimension, until you just added it.

I appreciate that you're OK with me reading your code for implementation ideas. I appreciate it as much for what appear to me to be really crisp examples of sophisticated JavaScript architecture as for learning how to wrangle D3 and Crossfilter. I'm slowly wrapping my head around it.

I have a question for you that's not germane to this thread. If you don't mind, I may send you an InMail on LinkedIn. I don't know of another way to do that without you exposing your email address in this public forum. Hope that's OK.

Doug


NickQiZhu commented 12 years ago

Yeah, apparently color encoding is more popular than I originally thought; I got requests from other users as well =) Building your visualization from scratch is absolutely the best way to learn d3. Feel free to send me a message on LinkedIn.