frictionlessdata / datapackage-pipelines-ckan

Data Package Pipelines processors for CKAN
MIT License

Processor to dump datapackage and resources to CKAN #6

Closed (brew closed 6 years ago)

brew commented 6 years ago

Description

Create a dump.to_ckan processor that can take the output of a pipeline and upload it to CKAN, creating datasets and resources as necessary.

The output of a pipeline is primarily a datapackage.json, and secondarily streaming resource files. These will be treated as a CKAN dataset and its stored resource files, respectively.

1) Initially, we will map the datapackage metadata to a newly created CKAN dataset (using the CKAN API action package_create), and create a new resource for each item listed in datapackage['resources'] (using the CKAN API action resource_create).

2) Optionally, resource data can be uploaded to the CKAN filestore.

3) Optionally, resource data can be added to the CKAN datastore.
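To make step 1 concrete, here is a rough sketch of the order of API calls. The helper names `ckan_request` and `plan_actions` are hypothetical, not part of any existing processor. The sketch assumes CKAN's action API under `/api/3/action/`, with the API key sent in the `Authorization` header. It only builds the requests and does not send them.

```python
import json
import urllib.request


def ckan_request(host, action, payload, api_key):
    """Build (but do not send) a POST request for a CKAN action API call."""
    url = "{}/api/3/action/{}".format(host.rstrip("/"), action)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


def plan_actions(datapackage, dataset_properties):
    """Return the ordered (action, payload) pairs for one datapackage."""
    dataset = {"name": datapackage["name"]}
    dataset.update(dataset_properties)  # explicit pipeline parameters win
    actions = [("package_create", dataset)]
    for resource in datapackage.get("resources", []):
        actions.append(("resource_create", {
            # Assumption: the dataset name is used as the package reference;
            # a real run could use the id returned by package_create instead.
            "package_id": dataset["name"],
            "name": resource["name"],
            # Assumption: the resource path stands in for the CKAN url field.
            "url": resource.get("path", ""),
        }))
    return actions
```

Each planned pair could then be passed to `ckan_request` and sent with `urllib.request.urlopen`. Steps 2 and 3 (filestore and datastore) are left out of this sketch.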

  run: ckan.dump.to_ckan
  parameters:
    ckan-host: http://demo.ckan.org
    ckan-api-key: env:CKAN_API_KEY
    stream_resources_to_datastore: false
    stream_resources_to_filestore: false
    dataset-properties:
        name: test-dataset-01
        owner_org: my-organization

ckan/ckanext-datapackager has a useful conversion library (currently being upgraded to the v1 specs) to map between datapackages and CKAN datasets. We propose to extract this into a separate library that can be imported by both projects ('ckan-datapackage-tools', or similar).
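As an illustration of the kind of mapping such a library would perform, here is a small sketch. It is not the ckanext-datapackager API. The function name `datapackage_to_dataset` is hypothetical, and the field choices (description to notes, keywords to tags) are assumptions based on common CKAN dataset fields.

```python
def datapackage_to_dataset(datapackage):
    """Map a few datapackage descriptor fields onto CKAN dataset fields."""
    dataset = {}
    # Fields assumed to share the same key on both sides.
    for key in ("name", "title", "version"):
        if key in datapackage:
            dataset[key] = datapackage[key]
    # CKAN stores the free-text description under 'notes'.
    if "description" in datapackage:
        dataset["notes"] = datapackage["description"]
    # CKAN tags are a list of dicts, each with a 'name'.
    if datapackage.get("keywords"):
        dataset["tags"] = [{"name": keyword} for keyword in datapackage["keywords"]]
    return dataset
```

A shared library would also need the reverse direction (CKAN dataset to datapackage) and handling for licenses, sources and resources, none of which is attempted here.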

Tasks