rzel / google-refine

Automatically exported from code.google.com/p/google-refine

Meta facet #58

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
(Requested separately by Johan Sundström, Will Moffat, and Matt Hampel.)

We want a meta facet which is a numeric range facet on top of a text (list) facet, so that we can select all facet choices having counts within some range (say, greater than 1).

A numeric-range-on-top-of-text meta facet might not be the only way to go meta. We might want a scatterfacet in which each axis plots the facet choice count of some text facet.

Original issue reported on code.google.com by dfhu...@gmail.com on 22 May 2010 at 11:40

GoogleCodeExporter commented 9 years ago
Upon further thinking, it seems impossible to support more than one meta facet simultaneously. Consider two meta facets A and B, both of which have selections. In order for A to instantiate its RowFilter, it must first compute its choice/count pairs. In order to do so, A needs a FilteredRows that incorporates the RowFilter by B. And vice versa: for B to instantiate its RowFilter, it needs the RowFilter by A.

Not all is lost, though. Instead of meta facets, what we can easily support is a command that creates a new column and fills it in with facet choice counts. Then a numeric range facet can be used on that new column. The drawback is that those counts don't change dynamically.
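
As a rough illustration of the column-of-counts idea (a minimal Python sketch, not Refine's actual code; the function name, column names, and sample rows are made up for this example):

```python
from collections import Counter


def add_choice_count_column(rows, column, new_column):
    """Return copies of `rows` with `new_column` set to the number of rows
    that share the same value in `column` (i.e. the text facet choice count)."""
    counts = Counter(row[column] for row in rows)
    return [{**row, new_column: counts[row[column]]} for row in rows]


# Hypothetical sample data, not taken from any real project.
rows = [
    {"first": "A", "second": "C"},
    {"first": "A", "second": "D"},
    {"first": "B", "second": "D"},
]

with_counts = add_choice_count_column(rows, "first", "first_count")

# A numeric range facet on the new column then amounts to a plain filter,
# here "count greater than 1".
selected = [row for row in with_counts if row["first_count"] > 1]
```

The counts are a snapshot taken when the column is created, which is the drawback noted above: they do not follow later facet selections.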

Original comment by dfhu...@gmail.com on 23 May 2010 at 5:56

GoogleCodeExporter commented 9 years ago
I tried to do this manually - creating two facets for two different columns (I was using the movie sample data from the test data folder in the SVN trunk). It seems to work, so I'm not sure why we couldn't do it in code?

I created one for performances-actor and another for performances-character. I sorted performances-actor by count and selected all actors with counts above 5 (Mike Myers, Eddie Murphy & Cameron Diaz). I then sorted characters and selected all characters with counts above 3 (Princess Fiona, Shrek, Donkey). As each character was selected, the actor count was filtered and updated.

Original comment by iainsproat on 23 May 2010 at 7:11

GoogleCodeExporter commented 9 years ago
@iainsproat: you were probably lucky to get a case that works. Actually you don't need 2 meta facets to have a problem here. Just 1 meta facet and some regular facets are enough to cause problems.

Consider a data set with 3 rows and 2 columns

A  C
A  D
B  D

Consider 3 facets

Text facet P on first column
A (2)
B (1)

Text facet Q on second column
C (1)
D (2)

Meta facet M on first column (that is, M is meta with respect to P)
count of 2 (1)
count of 1 (1)

Now, in M, select count of 2:

M
count of 2 (1) -selected
count of 1 (1)

P
A (2)

Q
C (1)
D (1)

If in Q you select C, what do you expect to happen? Ignore M for the moment and consider P. Selecting C in Q would change P to

P
A (1)

This is because exactly 1 row has C. But M selects only rows that correspond to a choice in P with count 2, and no choice in P has count 2 any longer, so M must now select no row at all.
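
The example above can be replayed in a few lines of Python. This is only a sketch of the behaviour being described, not Refine's facet engine; `choice_counts`, `meta_filter`, and `context_rows` are hypothetical names introduced for illustration.

```python
from collections import Counter

# The 3-row, 2-column data set from the example.
ROWS = [("A", "C"), ("A", "D"), ("B", "D")]


def choice_counts(rows, col):
    """Text facet choice counts for column index `col`."""
    return Counter(row[col] for row in rows)


def meta_filter(rows, col, wanted_count, context_rows):
    """Keep rows whose value in `col` occurs exactly `wanted_count` times
    among `context_rows`, the rows the underlying text facet currently sees."""
    counts = choice_counts(context_rows, col)
    return [row for row in rows if counts[row[col]] == wanted_count]


# Nothing selected in Q: M's "count of 2" keeps both A rows.
before = meta_filter(ROWS, 0, 2, context_rows=ROWS)

# C selected in Q: P is now counted over the Q-filtered rows, so A drops
# to a count of 1 and M's "count of 2" no longer matches any row.
q_rows = [row for row in ROWS if row[1] == "C"]
after = meta_filter(q_rows, 0, 2, context_rows=q_rows)
```

Here `before` holds the two A rows and `after` is empty, which is the surprising result described above.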

One solution, as I mentioned before, is to create a column with the facet counts.

Another solution is to make meta facets not affected by other facets (nullifying the problem in the example above). That is, making selections in other facets shouldn't change a meta facet's choices. M only works with the choice counts in P when there is no facet selection whatsoever.
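
A minimal Python sketch of this second approach, assuming the meta facet takes its counts from the unfiltered rows (the variable names are hypothetical, and this is not the actual implementation):

```python
from collections import Counter

# The same 3-row, 2-column data set as in the example above.
ROWS = [("A", "C"), ("A", "D"), ("B", "D")]

# M's counts come from all rows, ignoring every facet selection,
# so they stay fixed no matter what is selected elsewhere.
fixed_counts = Counter(row[0] for row in ROWS)

# M selects "count of 2": both A rows pass.
m_rows = [row for row in ROWS if fixed_counts[row[0]] == 2]

# Selecting C in Q then narrows the result without disturbing M's counts.
final_rows = [row for row in m_rows if row[1] == "C"]
```

Under that assumption the combined selection keeps the single row (A, C) instead of becoming empty.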

The pros here include less data getting stored, fewer clicks to get what you want, and thus more interactivity. The cons include potential confusion as to how facets in general affect one another. I think with the right design the confusion can be mitigated. It's probably one of those cases where it would work as you expect if you don't think too much about it.

I'm leaning toward the second approach. It shouldn't be too hard to implement.

Original comment by dfhu...@gmail.com on 23 May 2010 at 6:05

GoogleCodeExporter commented 9 years ago
Fixed by r848. For any text facet, scroll down to the bottom of its choice list. You should see "facet by choice counts". Click on that and you'll get a numeric range facet.

Original comment by dfhu...@gmail.com on 24 May 2010 at 7:10

GoogleCodeExporter commented 9 years ago

Original comment by tfmorris on 18 Sep 2012 at 2:56