square / crossfilter

Fast n-dimensional filtering and grouping of records.
https://square.github.com/crossfilter/

PivotReduceCount #141

Closed agrass closed 8 years ago

agrass commented 9 years ago

I created a function similar to reduceCount that takes one or more keys and reduces each key combination only once, using a hash to check whether it has already been reduced. For example, suppose I have a 1-n relation (one event (id) that has many places and persons):

    {id: 1, date: "2011-11-14T16:17:54Z", place: 2, person: 1},
    {id: 1, date: "2011-11-14T16:17:54Z", place: 2, person: 2},
    {id: 1, date: "2011-11-14T16:17:54Z", place: 3, person: 5},
    {id: 2, date: "2011-11-14T16:28:54Z", place: 1, person: 3},
    {id: 2, date: "2011-11-14T16:28:54Z", place: 2, person: 4},
    {id: 3, date: "2011-11-14T16:48:46Z", place: 2, person: 1}

I have one Event model that has many places and also can have many people associated.

I searched the other issues but didn't find a solution, so I created a method in my own fork of crossfilter that helps me do that. It may be useful for others with this problem (many people using dc.js would have this same problem).

The method is similar to reduceCount, but is called as: pivotReduceCount(["key1", "key2"]).

For example, using the data above, if I want to count the places or events on a specific date:

1) First, set the dimension (date):

    var dimension = data.dimension(function(d) { return d.date; });
    var group = dimension.group();

2) Then reduce by date and place (to count the places on a specific date):

    var test = group.pivotReduceCount(["date", "place"]);

For example, the output of test.all() would be the number of different places for each date:

    [
      { key: "2011-11-14T16:17:54Z", value: 2 },
      { key: "2011-11-14T16:28:54Z", value: 2 },
      { key: "2011-11-14T16:48:46Z", value: 1 }
    ]

To compare, the output of the crossfilter method group.reduceCount would be:

    [
      { key: "2011-11-14T16:17:54Z", value: 3 },
      { key: "2011-11-14T16:28:54Z", value: 2 },
      { key: "2011-11-14T16:48:46Z", value: 1 }
    ]
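To make the difference between the two outputs concrete, here is a minimal sketch in plain JavaScript that does not use crossfilter at all. It recomputes both numbers from the sample rows: the row count per date (what reduceCount reports) and the number of distinct places per date (what the proposed pivot count reports). The helper name `countsByDate` is made up for this illustration.

```javascript
// Sample rows from the example above: one row per (event, place, person).
const rows = [
  { id: 1, date: "2011-11-14T16:17:54Z", place: 2, person: 1 },
  { id: 1, date: "2011-11-14T16:17:54Z", place: 2, person: 2 },
  { id: 1, date: "2011-11-14T16:17:54Z", place: 3, person: 5 },
  { id: 2, date: "2011-11-14T16:28:54Z", place: 1, person: 3 },
  { id: 2, date: "2011-11-14T16:28:54Z", place: 2, person: 4 },
  { id: 3, date: "2011-11-14T16:48:46Z", place: 2, person: 1 },
];

// Hypothetical helper: for each date, count all rows and count the
// distinct values of `field` among those rows.
function countsByDate(records, field) {
  const rowCounts = new Map();
  const distinct = new Map();
  for (const r of records) {
    rowCounts.set(r.date, (rowCounts.get(r.date) || 0) + 1);
    if (!distinct.has(r.date)) distinct.set(r.date, new Set());
    distinct.get(r.date).add(r[field]);
  }
  // Map preserves insertion order, so dates come out in first-seen order.
  return [...rowCounts.keys()].map((key) => ({
    key,
    rows: rowCounts.get(key),
    distinct: distinct.get(key).size,
  }));
}

const comparison = countsByDate(rows, "place");
console.log(comparison);
```

For the first date, `rows` is 3 and `distinct` is 2, because place 2 appears twice (once per person).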

sdnetwork commented 9 years ago

Really good. Do you think it is possible to have a pivotReduceSum, or a pivotReduce with a custom function for the reduce?

agrass commented 9 years ago

Yes, I'm working on that. I already pushed a pivotReduce(keys, add, remove, init) method. It's not 100% tested yet, but this week I'm going to push those methods more officially.

esjewett commented 9 years ago

It's not quite clear to me what you want the output of this function to be, but I do wonder if this is possible as a set of custom reduce functions rather than as a modification to Crossfilter itself. From your pull request, it looks like it should definitely be possible.
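As a rough sketch of the custom-reduce route, the functions below follow the shape that crossfilter's existing group.reduce(add, remove, initial) expects, so no change to the library would be needed. The name `distinctCountReducers` is hypothetical, and the use of JSON.stringify to build the combined key is one choice among several. The sketch is driven by hand here rather than through crossfilter, so it runs on its own.

```javascript
// Hypothetical factory: returns add/remove/initial functions that count
// each combination of `keys` only once per group.
function distinctCountReducers(keys) {
  // Combined key for a record, e.g. '["2011-11-14T16:17:54Z",2]'.
  const keyOf = (d) => JSON.stringify(keys.map((k) => d[k]));
  return {
    add(p, d) {
      const k = keyOf(d);
      p.seen[k] = (p.seen[k] || 0) + 1;
      if (p.seen[k] === 1) p.count += 1; // first row with this combination
      return p;
    },
    remove(p, d) {
      const k = keyOf(d);
      p.seen[k] -= 1;
      if (p.seen[k] === 0) {
        delete p.seen[k]; // last row with this combination is gone
        p.count -= 1;
      }
      return p;
    },
    initial() {
      return { count: 0, seen: {} };
    },
  };
}

// Drive the reducers manually with the three rows of the first date.
const reducers = distinctCountReducers(["date", "place"]);
const sample = [
  { date: "2011-11-14T16:17:54Z", place: 2, person: 1 },
  { date: "2011-11-14T16:17:54Z", place: 2, person: 2 },
  { date: "2011-11-14T16:17:54Z", place: 3, person: 5 },
];
let state = reducers.initial();
for (const d of sample) state = reducers.add(state, d);
const afterAdd = state.count; // 2 distinct places

state = reducers.remove(state, sample[0]);
const afterOneRemove = state.count; // still 2: another place-2 row remains

state = reducers.remove(state, sample[1]);
const afterTwoRemoves = state.count; // 1: only place 3 is left
console.log(afterAdd, afterOneRemove, afterTwoRemoves);
```

With the library, these would presumably be passed as `group.reduce(reducers.add, reducers.remove, reducers.initial)`. Note that the group value is then the whole `{ count, seen }` object, so consumers would need to read `value.count`.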

agrass commented 9 years ago

I added the pivotReduce method, and the description includes an example of its output compared with a normal reduceCount.

esjewett commented 9 years ago

I don't see the description. I wonder - does it produce similar results to this? https://github.com/esjewett/reductio#aggregations-standard-aggregations-exception-aggregation

agrass commented 9 years ago

The example is in the first comment. The method you showed me is very similar, but with this one you can select the keys that make the object unique. In some cases I needed to use two or more keys.

esjewett commented 9 years ago

Scrolling through the changes in the pull request, I don't see any comment. Did you push it?

That said, if you just need support for multiple keys, you could have the Reductio exception accessor return a string concatenating the keys using a separator you know doesn't appear in either of them. I'll look into directly supporting the multiple key scenario though, as it is a useful one.
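The separator matters because plain concatenation can make two different key combinations collide. A small sketch, with a hypothetical `compositeKey` helper and made-up records, shows the problem and the fix; a function of this shape could serve as the accessor described above, assuming the chosen separator never occurs in the values.

```javascript
// Hypothetical helper: join the chosen fields into one string key.
function compositeKey(record, keys, separator = "|") {
  return keys.map((k) => String(record[k])).join(separator);
}

// Two different (place, person) pairs, made up for this illustration.
const a = { place: 1, person: 23 };
const b = { place: 12, person: 3 };

// Without a separator both records collapse to the same key.
const naiveA = compositeKey(a, ["place", "person"], ""); // "123"
const naiveB = compositeKey(b, ["place", "person"], ""); // "123"

// With a separator that never appears in the values, the keys stay distinct.
const safeA = compositeKey(a, ["place", "person"]); // "1|23"
const safeB = compositeKey(b, ["place", "person"]); // "12|3"
console.log(naiveA === naiveB, safeA === safeB);
```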

agrass commented 9 years ago

Yes, it looks like concatenating the keys in the exception accessor gives the same value as pivotReduceCount (I didn't know your method existed when I created it). The other method, pivotReduce(keys, add, remove, init), is the same, but you can pass custom functions that will be called only if the keys you selected have not already been reduced. My bad, the example is in the first comment of the pull request.
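The idea of calling a custom function only for key combinations that have not yet been reduced could be sketched as a wrapper around ordinary reduce functions. This is not the code from the pull request: `uniqueReducers`, its argument order, and the `cost` field are assumptions made for the illustration. The wrapper tracks how many rows share each key combination and forwards to the user's add/remove only when a combination enters or leaves the group.

```javascript
// Hypothetical wrapper: call `add` only for the first row with a given
// key combination, and `remove` only when the last such row is removed.
function uniqueReducers(keys, add, remove, init) {
  const keyOf = (d) => JSON.stringify(keys.map((k) => d[k]));
  return {
    add(p, d) {
      const k = keyOf(d);
      const n = (p.seen.get(k) || 0) + 1;
      p.seen.set(k, n);
      if (n === 1) p.value = add(p.value, d);
      return p;
    },
    // Assumes every removed row was added earlier.
    remove(p, d) {
      const k = keyOf(d);
      const n = p.seen.get(k) - 1;
      if (n === 0) {
        p.seen.delete(k);
        p.value = remove(p.value, d);
      } else {
        p.seen.set(k, n);
      }
      return p;
    },
    initial() {
      return { value: init(), seen: new Map() };
    },
  };
}

// Sum an event-level `cost` (a made-up field) once per event id, even
// though each event is repeated on several rows.
const sumCost = uniqueReducers(
  ["id"],
  (total, d) => total + d.cost,
  (total, d) => total - d.cost,
  () => 0
);
const eventRows = [
  { id: 1, place: 2, cost: 10 },
  { id: 1, place: 3, cost: 10 },
  { id: 2, place: 1, cost: 5 },
];
let acc = sumCost.initial();
for (const d of eventRows) acc = sumCost.add(acc, d);
const totalOnce = acc.value; // 15, not 25

acc = sumCost.remove(acc, eventRows[0]);
const afterPartialRemove = acc.value; // 15: event 1 still has a row

acc = sumCost.remove(acc, eventRows[1]);
const afterFullRemove = acc.value; // 5: event 1 is fully removed
console.log(totalOnce, afterPartialRemove, afterFullRemove);
```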

esjewett commented 9 years ago

I see, interesting and useful. I'll think about adding that capability of arbitrary reduce methods to Reductio's exception aggregation capability.

RandomEtc commented 8 years ago

Thanks for your contributions and sorry for the silence on this side. As discussed in #151, an active fork is being developed in a new Crossfilter Organization. Please consider rebasing and opening your PR there (if you haven't already), where it should be warmly welcomed by the new maintainers. Cheers!