TSSlade / google-refine

Automatically exported from code.google.com/p/google-refine

Option to exclude some candidates in a cluster from being merged #105

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
In the cell cluster & merge dialog, it should be possible to exclude individual 
candidates from a cluster before that cluster is merged.

An example I came across was the following cluster:
"Paris 1, Paris 2, Paris 3, Piraeus, Paris 4....".
If I merge this cluster, Piraeus is erroneously changed to Paris.
I'd like an option to exclude Piraeus from being merged.
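
To make the requested behaviour concrete, here is a minimal Python sketch of a merge that skips excluded candidates. It is not Refine's actual implementation or API; `merge_cluster`, its parameters, and the sample column (including the extra "London" cell) are hypothetical names and data chosen for illustration only.

```python
from typing import Iterable, List


def merge_cluster(
    values: List[str],
    cluster: Iterable[str],
    new_value: str,
    excluded: Iterable[str] = (),
) -> List[str]:
    """Return a copy of `values` where each cell that matches a selected
    cluster member is replaced by `new_value`.

    Hypothetical helper: `excluded` lists cluster members the user has
    unticked, so cells holding those values are left untouched.
    """
    selected = set(cluster) - set(excluded)
    return [new_value if v in selected else v for v in values]


# Sample column and the cluster suggested for it (illustrative data).
column = ["Paris 1", "Piraeus", "Paris 2", "Paris 3", "Paris 4", "London"]
cluster = ["Paris 1", "Paris 2", "Paris 3", "Piraeus", "Paris 4"]

merged = merge_cluster(column, cluster, "Paris", excluded=["Piraeus"])
print(merged)
# ['Paris', 'Piraeus', 'Paris', 'Paris', 'Paris', 'London']
```

Without the `excluded` argument, the same call would rewrite "Piraeus" to "Paris" as well, which is the problem described above.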

Original issue reported on code.google.com by iainsproat on 30 Jun 2010 at 7:18

GoogleCodeExporter commented 8 years ago
Hmm, what about just using external reconciliation to do this?

Original comment by stefano.mazzocchi@gmail.com on 1 Jul 2010 at 7:55

GoogleCodeExporter commented 8 years ago
Using external reconciliation loses all the benefits of doing cluster & merge.

If the reconciliation engine doesn't give the cell value a high score, e.g. 
'Paris 1' is not scored as a strong match for the city of Paris, I'd have to go 
through each of the Paris cells and make sure it gets reconciled to the correct 
Paris topic.  Not ideal.

Original comment by iainsproat on 1 Jul 2010 at 8:48