Once #150 is implemented and we have a more robust way to upload data into the Datastore, we'll have the necessary metadata (JSON Table Schema) to support the development of Data Enrichment Extensions.
Perhaps enrichment could be done by either local services or remote services using webhooks (#122)?
Some obvious data-enrichment services:
- Geocoding (with plugins for various geocoding services, perhaps leveraging geopy). In NYC, we already developed a Geoclient geocoder for geopy. :smile:
- Tighter OpenRefine integration, i.e. clearer association/presentation of the pre-refined dataset, the recipe, and the refined dataset
Data Enrichment Extensions are a special class of CKAN extensions that operate on Datastore data.
To obviate the need for revisions, I suggest that all enrichment be strictly additive: it only adds columns, perhaps with a special prefix (e.g. _e_latitude, _e_longitude). This is similar to how CartoDB adds columns to data when you import it (e.g. cartodb_id, the_geom).
Having this convention for enriched column names may also help with linking entities across datasets (e.g. _e_wikidata_id, _e_wikidata_url), since the names are standardized.
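
To make the additive convention concrete, here is a minimal sketch of what an enrichment step might look like. It is not an existing CKAN API: `enrich_records`, `ENRICHED_PREFIX`, and `fake_geocoder` are hypothetical names, the rows are plain dicts standing in for Datastore records, and the coordinates are illustrative placeholder values rather than real geocoder output.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Prefix suggested in this proposal; the constant name itself is hypothetical.
ENRICHED_PREFIX = "_e_"

Record = Dict[str, object]


def enrich_records(
    records: Iterable[Record],
    enricher: Callable[[Record], Optional[Dict[str, object]]],
) -> List[Record]:
    """Return copies of `records` with prefixed enrichment columns added.

    Additive only: existing columns are never modified or removed.
    """
    enriched = []
    for record in records:
        new_record = dict(record)  # copy, so the source row stays untouched
        extra = enricher(record) or {}
        for name, value in extra.items():
            column = ENRICHED_PREFIX + name
            if column in record:
                # Refuse to overwrite anything that is already there.
                raise ValueError(f"column {column!r} already exists")
            new_record[column] = value
        enriched.append(new_record)
    return enriched


def fake_geocoder(record: Record) -> Optional[Dict[str, object]]:
    """Stand-in for a real geocoding service (assumption, no network call)."""
    known = {"City Hall, New York, NY": (40.7128, -74.0060)}  # placeholder values
    coords = known.get(record.get("address"))
    if coords is None:
        return None  # no match: the row gets no enrichment columns
    return {"latitude": coords[0], "longitude": coords[1]}


rows = [
    {"id": 1, "address": "City Hall, New York, NY"},
    {"id": 2, "address": "unknown"},
]
result = enrich_records(rows, fake_geocoder)
```

In this sketch the first row gains `_e_latitude` and `_e_longitude` while its original columns stay as they were, and the unmatched row is returned unchanged. A real extension would presumably add the columns to the Datastore table and store nulls for unmatched rows instead.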