Open rufuspollock opened 11 years ago
I'm building ScraperWikiX at https://github.com/rossjones/ScraperWikiX/
OpenRefine recipes/vignettes, especially for "standardised" data formats? E.g. http://schoolofdata.org/2013/07/26/using-openrefine-to-clean-multiple-documents-in-the-same-way/
Apache OODT? http://oodt.apache.org/ Check out DRAT (Distributed Release Audit Tool) as an example of OODT ETL in action: http://github.com/chrismattmann/drat.git