Description
Create a dump.to_ckan processor that takes the output of a pipeline and uploads it to CKAN, creating datasets and resources as necessary.
The output of a pipeline is primarily a datapackage.json and secondarily the streamed resource files. These will be treated as a CKAN dataset and its stored resource files, respectively.
1) Initially, we will map the datapackage metadata to a newly created CKAN dataset (using CKAN's API package_create) and create a new resource for each item listed in datapackage['resources'] (using CKAN's API resource_create).
2) Optionally, resource data can be uploaded to the CKAN filestore.
3) Optionally, resource data can be added to the CKAN datastore.
ckan/ckanext-datapackager has a useful conversion library (currently being upgraded to the v1 specs) for mapping between datapackages and CKAN datasets. We propose to extract it into a separate library ('ckan-datapackage-tools', or similar) that can be imported by both projects.
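As a rough illustration of step 1, the sketch below builds the request bodies for package_create and resource_create from a datapackage descriptor. The helper names (dataset_payload, resource_payloads, build_action_request) and the field mapping are hypothetical and deliberately minimal; they are not the ckanext-datapackager conversion library, and the real processor may map many more fields. The requests are only constructed here, not sent.

```python
import json
from urllib.request import Request


def dataset_payload(datapackage):
    """Hypothetical minimal mapping from a datapackage to package_create params."""
    payload = {"name": datapackage["name"]}
    if "title" in datapackage:
        payload["title"] = datapackage["title"]
    if "description" in datapackage:
        # CKAN stores the free-text description of a dataset in 'notes'.
        payload["notes"] = datapackage["description"]
    return payload


def resource_payloads(datapackage, package_id):
    """One resource_create payload per item in datapackage['resources']."""
    payloads = []
    for res in datapackage.get("resources", []):
        payloads.append({
            "package_id": package_id,
            "name": res["name"],
            # Assumption: fall back to the local path when no URL is given.
            "url": res.get("url") or res.get("path", ""),
            "format": res.get("format", ""),
        })
    return payloads


def build_action_request(ckan_url, action, payload, api_key):
    """Build (but do not send) a POST request for a CKAN Action API call."""
    return Request(
        url=f"{ckan_url.rstrip('/')}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )
```

Each request could then be sent with urllib.request.urlopen(request), creating the dataset first and using its name or id as package_id for the resource calls. Uploading file contents to the filestore or loading rows into the datastore (steps 2 and 3) would need additional calls that are not shown here.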
Tasks