att / rcloud.dcplot

Dimensional charting (dc.js) for RCloud

wdcplot reduce - percentage? #8

Open gordonwoodhull opened 9 years ago

gordonwoodhull commented 9 years ago

From @rhknag on October 10, 2014 0:7

Is there a way to get a percentage out of wdcplot? We are starting to deal with samples of large data, but wdcplot seems to favor having all the data. Can the reduce "any" function be used? I couldn't find examples. I learned from Dave Kapilow that dc.js also does not have the logic to do a percent calculation - is that right?

Copied from original issue: att/rcloud#901

gordonwoodhull commented 9 years ago

dc.js and crossfilter can do any kind of aggregation, but it does expect to have all the data - I think you'd have to use something external for sampling.

I assume you mean returning a percentage of the total for each bin, instead of its straight sum. dcplot.js and wdcplot.js do not have percent calculation built in. It's really external to what crossfilter does - you would still need to do a regular sum in the group reductions and then calculate the percentages afterward. This is because crossfilter expects to be able to calculate reductions incrementally, and a percentage depends on the values of the other bins so it doesn't really fit that model.
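As a rough sketch of the "sum first, percentages afterward" idea: the snippet below assumes the bins are already available as an array of {key, value} objects, which is the shape crossfilter's group.all() returns. The sample data and the helper name toPercentages are made up for illustration and are not part of dcplot.js or wdcplot.js.

```javascript
// Hypothetical bins, shaped like the {key, value} objects from group.all().
// In a real chart these would be the plain per-bin sums from the group reduction.
const sampleBins = [
  { key: 'a', value: 30 },
  { key: 'b', value: 50 },
  { key: 'c', value: 20 }
];

// Post-processing step: each percentage depends on the total over ALL bins,
// which is why it cannot be expressed as an incremental per-bin reduction.
function toPercentages(bins) {
  const total = bins.reduce((sum, bin) => sum + bin.value, 0);
  return bins.map(bin => ({
    key: bin.key,
    // Guard against an empty or all-zero set of bins.
    value: total === 0 ? 0 : 100 * bin.value / total
  }));
}

const samplePercentages = toPercentages(sampleBins);
```

The sums stay incremental inside crossfilter; only this last step needs to see every bin at once, so it has to be rerun whenever the filters change.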

"any" just grabs any value - it's intended for reductions where you expect all the records to have the same value. I don't think that helps here.

Percentage could be supported using a group wrapper (a.k.a. fake group). Or we could offer a way to get at the vector of all bins, or the sum of the bins, from R.
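A minimal sketch of what such a group wrapper might look like, assuming the chart only needs an object with an all() method that returns {key, value} bins. The name percentageGroup and the stub group are hypothetical; the stub stands in for a real crossfilter group so the example runs on its own.

```javascript
// Fake group: wraps a source group and recomputes percentages on every all() call,
// so the result always reflects the source group's current (filtered) sums.
function percentageGroup(sourceGroup) {
  return {
    all: function () {
      const bins = sourceGroup.all();
      const total = bins.reduce((sum, bin) => sum + bin.value, 0);
      return bins.map(bin => ({
        key: bin.key,
        value: total === 0 ? 0 : 100 * bin.value / total
      }));
    }
  };
}

// Stand-in for a crossfilter group (assumption: only all() is needed here).
const stubCounts = { x: 1, y: 3 };
const stubGroup = {
  all: () => Object.keys(stubCounts).map(k => ({ key: k, value: stubCounts[k] }))
};

const pctGroup = percentageGroup(stubGroup);
```

Because the wrapper does its work lazily inside all(), nothing is cached: when the underlying sums change, the next all() call returns percentages against the new total. The wrapped object would be passed to the chart in place of the original group.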