ericmckean / google-refine

Automatically exported from code.google.com/p/google-refine

"exclude" check box option on cluster member lists #514

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Because of undesired value(s) in the clustered result, I could not merge the 
rest of the desired values. Please see the example below:

 mirpurkhas 1 
 mirpurkhas-1
 mirpurkhas-2
 mirpurkhas I
 mir purkas 1

It would be great to have an option (a check box) to exclude line 3 from the merge.
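
A minimal sketch of the requested behavior, written in Python rather than taken from Refine's code. The function name `merge_cluster` and its `excluded` parameter are hypothetical and only illustrate what an "exclude" check box could do: every cluster member is rewritten to the chosen value except the ones that are excluded.

```python
def merge_cluster(values, cluster, new_value, excluded=frozenset()):
    """Replace each cluster member in `values` with `new_value`,
    leaving members listed in `excluded` untouched (hypothetical helper)."""
    to_merge = set(cluster) - set(excluded)
    return [new_value if v in to_merge else v for v in values]


# The cluster from the example above; "mirpurkhas-2" is line 3.
cluster = [
    "mirpurkhas 1",
    "mirpurkhas-1",
    "mirpurkhas-2",
    "mirpurkhas I",
    "mir purkas 1",
]

merged = merge_cluster(cluster, cluster, "mirpurkhas 1",
                       excluded={"mirpurkhas-2"})
print(merged)
```

With the exclusion in place, the four wanted members are merged in one step and the excluded value keeps its original text.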

Original issue reported on code.google.com by kyawnain...@gmail.com on 28 Dec 2011 at 6:52

GoogleCodeExporter commented 9 years ago
Thanks for the suggestion.  You don't say what clustering algorithm you are 
using, but you can probably work around this by adjusting the clustering 
parameters to exclude the undesirable result.

Original comment by tfmorris on 28 Dec 2011 at 7:01

GoogleCodeExporter commented 9 years ago
@tfmorris
Thanks for your comments.
I use all the algorithms because they produce different clusters. Metaphone3 
and Cologne-phonetic yield the most promising clusters. Since they produce 
different clusters, I run one after the other. These algorithms do not have 
any parameters to adjust.
Some clusters from these algorithms contain only the right members to merge, 
while other clusters from the same algorithm mix desirable and undesirable 
members. In the latter case, e.g. six correct members cannot be merged because 
of one unwanted member, and I then have to go back to the text facet and edit 
5 times. I am also using multiple text facets as filters to edit specific 
records (e.g. a group of villages under a specific district under a specific 
province, because village names are very similar across different districts 
and provinces). This kind of editing is therefore far more time-consuming than 
merging through clustering.

Original comment by kyawnain...@gmail.com on 28 Dec 2011 at 7:34

GoogleCodeExporter commented 9 years ago
Perhaps an easier way might be to apply your various filters and then STAR 
those rows, and then do the clustering? You can use Flag and Star to hold onto 
your filtered items for further analysis with Refine's other tools, such as 
clustering.
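
As a rough illustration of the idea (not Refine's implementation), the sketch below clusters only the rows that pass a "starred" test. The row data, the column names, and the helpers `fingerprint` and `cluster_starred` are all hypothetical, and the key function is a simplified stand-in for a real keying method.

```python
import re
from collections import defaultdict


def fingerprint(value):
    # Simplified key: lowercase, drop punctuation, sort the unique tokens.
    tokens = re.sub(r"[^\w\s]", "", value.lower()).split()
    return " ".join(sorted(set(tokens)))


def cluster_starred(rows, column, is_starred):
    """Group values of `column` by key, looking only at starred rows."""
    groups = defaultdict(set)
    for row in rows:
        if is_starred(row):
            groups[fingerprint(row[column])].add(row[column])
    # Only groups with more than one distinct spelling are clusters.
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Made-up rows: the same village name spelled differently in two districts.
rows = [
    {"province": "P1", "district": "D1", "village": "Mirpur Khas"},
    {"province": "P1", "district": "D1", "village": "khas mirpur"},
    {"province": "P1", "district": "D2", "village": "Mirpur, Khas"},
]

# "Star" only the rows for district D1, then cluster within them.
clusters = cluster_starred(rows, "village", lambda r: r["district"] == "D1")
print(clusters)
```

Because the D2 row is never starred, its similar-looking village name stays out of the cluster.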

Original comment by thadguidry on 28 Dec 2011 at 8:29