Open daguar opened 11 years ago
Pinging @dthompson and @spara
Let's implement this new approach here, since some of the logic's already baked in.
Also pinging @bensheldon (who is interested) for great justice.
I'd be happy to take a shot at this as my first project of the summer (GSoC) - I'll be flying up to the office Wednesday morning. Quick questions for @daguar:
Great!
For data sets, I have a few good test cases:

A. City restaurant inspections data sets (part of the LIVES data standard push; see: http://foodinspectiondata.us/ ). I'm going to tag @dthompson @danavery @migurski for FYI, since it could support some efforts we're doing to help make this data more integration-friendly.

B. Cross-referencing two city data sets from an open data portal where we have a crosswalk (e.g., property ID) but where each data set has address fields that differ from one another. This would allow us to benchmark this approach.
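To make the benchmarking idea in test case B a bit more concrete, here is a rough Python sketch of how a crosswalk could be used to score an address-matching approach. Everything in it is an assumption for illustration: the function names `naive_normalize` and `match_rate`, the property IDs, and the sample addresses are all made up, and the placeholder normalizer only stands in for whatever approach actually gets implemented.

```python
from typing import Callable, Dict


def naive_normalize(address: str) -> str:
    """Hypothetical placeholder: lowercase, drop punctuation, collapse spaces."""
    cleaned = "".join(ch for ch in address.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def match_rate(
    a: Dict[str, str],
    b: Dict[str, str],
    normalize: Callable[[str], str],
) -> float:
    """Share of crosswalked records whose normalized addresses agree.

    `a` and `b` map a shared crosswalk key (e.g. a property ID) to the raw
    address string found in each data set.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    hits = sum(1 for key in shared if normalize(a[key]) == normalize(b[key]))
    return hits / len(shared)


# Invented sample records: same properties, differently formatted addresses.
dataset_a = {"P1": "123 Main St.", "P2": "45 Oak Avenue", "P3": "9 Elm St"}
dataset_b = {"P1": "123 MAIN ST", "P2": "45 Oak Ave", "P3": "9  Elm St."}

rate = match_rate(dataset_a, dataset_b, naive_normalize)
print(f"match rate: {rate:.2f}")
```

Because the crosswalk tells us which records really are the same property, the match rate gives a simple score for comparing normalizers: any callable that takes a raw address and returns a canonical string can be swapped in for `naive_normalize`.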
In my ideal world, this would be available both as a web app (for less technical users) and as a CLI tool for more heavy-duty use (in particular where the API calls might become an issue).
Planned approach