ddavisqa / google-refine

Automatically exported from code.google.com/p/google-refine

Improving Reconciliation Best Candidate Score #339

Closed by GoogleCodeExporter 8 years ago

GoogleCodeExporter commented 8 years ago
I plan on running a saved JSON history (attached) on a large amount of data 
over the next couple of months. A key part of this project is a reconciliation 
against the Freebase service for 2 of the columns ("Artist or Director" and 
"Title").

The issue is that I have low confidence in some of the results I'm getting
back, and I'm concerned about missing obvious matches.

The test project I've been working with is attached. I wanted to see if there
is a better way to run the reconciliation, perhaps by not using the columns I'm
including for relevance, or by tweaking something else.

Example data are in rows 667, 668 and 669. The artist is "Doggy's Angels",
which gets a pretty good match. The title "Baby If You're Ready" doesn't get
any good hits, but clicking "Search for Match" does find the exact title. Why
wouldn't reconciliation display this in the first place?
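
One way to narrow this down is to compare the query for the title with and
without the relevance column. The following is a minimal sketch, assuming the
service accepts the standard Refine reconciliation query shape (a batch of
queries keyed q0, q1, ..., each with "query", "limit" and an optional
"properties" list of "pid"/"v" pairs). The function name build_query and the
property ID are hypothetical placeholders, not values from the attached
project, and the sketch only builds the payload; it does not contact any
service.

```python
import json


def build_query(title, artist=None, limit=3):
    """Build one reconciliation query; constrain by artist only if one is given."""
    query = {"query": title, "limit": limit}
    if artist is not None:
        # Placeholder property ID (an assumption); substitute the property
        # that the "Artist or Director" column is actually mapped to.
        query["properties"] = [{"pid": "HYPOTHETICAL_ARTIST_PROPERTY", "v": artist}]
    return query


title = "Baby If You're Ready"
batch = {
    "q0": build_query(title, artist="Doggy's Angels"),  # title plus relevance column
    "q1": build_query(title),                           # title only
}

# Serialized form of the batch, as it would be sent in the "queries" parameter.
payload = json.dumps(batch, indent=2)
print(payload)
```

If the title-only query returns the exact title while the constrained one does
not, the extra column is the likely cause; if neither does, the cause is more
likely on the service side.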

Is this just something I need to accept, or is there something you can see
that I could adjust to get better results?

I truly appreciate your time.

Vinny

Original issue reported on code.google.com by vinnygof...@gmail.com on 23 Feb 2011 at 9:19


GoogleCodeExporter commented 8 years ago
Please let me know if additional information is needed.

Thanks.

Original comment by vinnygof...@gmail.com on 1 Mar 2011 at 2:57

GoogleCodeExporter commented 8 years ago
This sounds like a known problem with the Freebase reconciliation service using
a stale index. Unfortunately, that is out of Refine's control. Google is in the
process of re-implementing the reconciliation service, but until that's
complete we just have to live with the stale index.

Original comment by tfmorris on 8 Oct 2011 at 7:20