ckan / ideas

[DEPRECATED] Use the main CKAN repo Discussions instead:
https://github.com/ckan/ckan/discussions

Data Enrichment Extensions #151

Open jqnatividad opened 9 years ago

jqnatividad commented 9 years ago

Once #150 is implemented and we have a more robust way to upload data into the Datastore, we'll have the necessary metadata (JSON Table Schema) to support the development of Data Enrichment Extensions.

Perhaps enrichment could be done by either local services or remote services using webhooks (#122)?

Some obvious data-enrichment services:

Data Enrichment Extensions are a special class of CKAN extensions that operate on Datastore data.

To obviate the need for revisions, I suggest that all enrichment be additive only: it adds columns, perhaps with a special prefix (e.g. _e_latitude, _e_longitude). This is similar to how CartoDB adds columns to data when you import it (i.e. cartodb_id, the_geom).
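As a rough illustration of the additive-only idea, here is a minimal Python sketch that works on plain in-memory rows rather than the Datastore. The function name `enrich_additively`, the `ENRICHED_PREFIX` constant, and the sample values are all assumptions made for the example; nothing here is an existing CKAN API.

```python
ENRICHED_PREFIX = "_e_"  # prefix proposed above; hypothetical constant name


def enrich_additively(records, enrichments):
    """Return copies of the rows with enrichment values added as prefixed columns.

    records: list of dicts (one per row); never modified.
    enrichments: list of dicts, one per row, mapping plain names
    (e.g. "latitude") to values.
    """
    if len(records) != len(enrichments):
        raise ValueError("records and enrichments must have the same length")
    enriched = []
    for row, extra in zip(records, enrichments):
        new_row = dict(row)  # copy, so original columns are left untouched
        for name, value in extra.items():
            column = ENRICHED_PREFIX + name
            if column in new_row:
                # Additive only: refuse to overwrite an existing column.
                raise ValueError(f"column {column!r} already exists")
            new_row[column] = value
        enriched.append(new_row)
    return enriched


rows = [{"address": "123 Example St"}]
# Stand-in for a geocoder's output; the coordinates are illustrative only.
coords = [{"latitude": 40.0, "longitude": -80.0}]
result = enrich_additively(rows, coords)
```

Because the original columns are copied and never overwritten, the source data stays intact and no revision history is needed to undo an enrichment: dropping the `_e_` columns restores the original table.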

Having this convention for enriched column names may also help with linking entities across datasets (e.g. _e_wikidata_id, _e_wikidata_url), since the names are standardized.
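To make the linking idea concrete, here is a small Python sketch that pairs rows from two datasets on a shared `_e_wikidata_id` column. The helper name `link_on_enriched_id`, the sample datasets, and the placeholder identifiers are assumptions for illustration only; they are not real Wikidata IDs or CKAN functions.

```python
from collections import defaultdict

LINK_COLUMN = "_e_wikidata_id"  # standardized enriched column name from above


def link_on_enriched_id(left_rows, right_rows, key=LINK_COLUMN):
    """Return (left, right) pairs of rows that share the same enriched id."""
    index = defaultdict(list)
    for row in right_rows:
        if row.get(key) is not None:  # skip rows that were never enriched
            index[row[key]].append(row)
    pairs = []
    for row in left_rows:
        for match in index.get(row.get(key), []):
            pairs.append((row, match))
    return pairs


# Placeholder identifiers, not real Wikidata entities.
schools = [
    {"name": "School A", "_e_wikidata_id": "Q_EXAMPLE_1"},
    {"name": "School B", "_e_wikidata_id": None},
]
parks = [
    {"park": "Park X", "_e_wikidata_id": "Q_EXAMPLE_1"},
    {"park": "Park Y", "_e_wikidata_id": "Q_EXAMPLE_2"},
]
links = link_on_enriched_id(schools, parks)
```

Since both datasets use the same column name, the join needs no per-dataset mapping; rows without an enriched id are simply left unlinked.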

jqnatividad commented 5 years ago

For geocoding, we have deployed https://github.com/wardi/ckanext-geocodejob in WPRDC and it works!