datatonic / duke

Automatically exported from code.google.com/p/duke

Consider integration with other open source platforms #80

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
There are other open source tools for record linkage/deduplication, e.g.:

 * FRIL: http://fril.sourceforge.net/ It has a really nice GUI and implements a good set of standard record linkage methods. It also allows manual review of possible matches.
 * FEBRL: http://datamining.anu.edu.au/software/febrl/febrl-04.html A bit outdated and written in Python, but it implements several useful algorithms not found in other projects, so it should be considered at least for reference.
 * WEKA: http://www.cs.waikato.ac.nz/ml/weka/ Not directly related to record linkage/deduplication, but it implements several classification and clustering methods often used in record linkage problems.

I think these project links are useful, at least for reference.

Original issue reported on code.google.com by davide.r...@gmail.com on 27 Jun 2012 at 2:55

GoogleCodeExporter commented 9 years ago
I'm not necessarily averse to integrating with these tools, but I'm not sure 
what, exactly, to integrate with. Do you have any specific proposals?

Original comment by lar...@gmail.com on 3 Jul 2012 at 7:17

GoogleCodeExporter commented 9 years ago
Please make a specific proposal, or I'm afraid I'll have to close this issue as 
inconclusive.

Original comment by lar...@gmail.com on 23 Jul 2012 at 8:10

GoogleCodeExporter commented 9 years ago
That was mainly for reference purposes; I couldn't find a better place for this.
You can safely close the issue.

Original comment by davide.r...@gmail.com on 23 Jul 2012 at 9:02

GoogleCodeExporter commented 9 years ago
Ok, will do.

Original comment by lar...@gmail.com on 23 Jul 2012 at 9:05