billmill20 / google-refine

Automatically exported from code.google.com/p/google-refine

Reconcile is not picking up alias hints or even type hints correctly #156

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Import attached project of saint names.
2. Reconcile Column 1 - Name against type /base/saints/saint with no other 
included columns, and uncheck "Auto-match candidates with high confidence".
3. After reconcile has finished, notice that many names (rows 1 & 2 
specifically) should be at least 60% matchable against the /saint type, and that 
Freebase has exact aliases for the correct topics (for example, St. Gregory the Great).
4. Suggest (Search API) seems to do better on its own with the same strings, for 
some reason.

What is the expected output? What do you see instead?

Reconcile returns no matches, when it should return several, as Suggest seems to 
do when you click on "search for match" in the column to manually find a good 
candidate.

Analyze with Colin or Andi & Faye to find out where or why.

(and for those curious, NO, I don't give a rats arse about religion, just a 
good reconcile example)

Original issue reported on code.google.com by thadguidry on 13 Oct 2010 at 3:27


GoogleCodeExporter commented 8 years ago
I believe this actually has nothing to do with aliases. The problem is that 
Refine batches several cells together when it calls the recon service, and 
that can somehow overload relevance or acre. I'm lowering the batch size 
and hope that helps (r1470). Otherwise, filter for rows that haven't been 
matched and reconcile them again.
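
The batching and the retry workaround described above can be illustrated with a minimal Python sketch. This is not Refine's code: the function names, the row layout (a dict with `name` and `match` keys), and the batch size used in the usage note are all assumptions made for the example, and the thread does not state the actual batch sizes before or after r1470.

```python
from typing import Dict, Iterator, List, Optional, Sequence


def batches(values: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size cell values.

    A smaller batch_size means more requests, each carrying fewer queries.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(values), batch_size):
        yield list(values[start:start + batch_size])


def unmatched_names(rows: Sequence[Dict[str, Optional[str]]]) -> List[str]:
    """Return the names of rows that still have no match (hypothetical row layout)."""
    return [row["name"] for row in rows if row.get("match") is None]
```

For instance, `list(batches(names, 3))` splits the names into groups of three, and `unmatched_names(rows)` selects only the rows worth sending through reconciliation a second time.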

Original comment by dfhu...@google.com on 13 Oct 2010 at 5:05

GoogleCodeExporter commented 8 years ago

Original comment by iainsproat on 14 Oct 2010 at 4:58

GoogleCodeExporter commented 8 years ago
Lowering the batch size seems to have helped with the attached test project 
file for /saints.

However, I'm now noticing that even though I uncheck the "Auto-match candidates 
with high confidence", Refine still auto-matches on some.  I would have thought 
that unchecking the box would explicitly say not to "auto-match at all, and I 
want to see all my candidates and I'll pick the matches I want".  Is that how 
it should work? If not, can we get a good description of what the intended 
function is supposed to be? Thanks for working with me on this, David & team, 
as always!

Original comment by thadguidry on 15 Oct 2010 at 3:10

GoogleCodeExporter commented 8 years ago
Crap, that last comment came out wrong. Rather:
I would have thought that unchecking the box would explicitly say "DON'T 
Auto-match anything, just show me the candidates and I will pick the matches 
that I want".
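
The behaviour expected here can be written down as a small Python sketch. It captures only the reporter's expectation, not how Refine actually decides; `Candidate`, `choose_match`, and the `threshold` value are hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    topic_id: str   # hypothetical identifier of a candidate topic
    score: float    # hypothetical confidence score in [0, 1]


def choose_match(candidates: List[Candidate],
                 auto_match: bool,
                 threshold: float = 0.9) -> Optional[Candidate]:
    """Return a candidate to match automatically, or None.

    With auto_match disabled, nothing is ever matched automatically:
    every candidate is left for the user to pick by hand.
    """
    if not auto_match or not candidates:
        return None
    best = max(candidates, key=lambda c: c.score)
    return best if best.score >= threshold else None
```

Under this reading, `choose_match(candidates, auto_match=False)` returns `None` no matter how high the top score is.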

Original comment by thadguidry on 15 Oct 2010 at 3:12