This project is Phase 1 of the API import project.
It will result in an API-Configuration file that can be customized for any end-point.
It will reuse bulk upload and import script code.
define API for importing from another CKAN instance
define mappings of fields
define access auth requirements
Import-start function: the first version will have a button on the Config page. Eventually we will use a timer to trigger it automatically.
define end-point fields:
url
architecture (CKAN, Socrata, Dataverse, etc)
auth tokens required
metadata mapping file
catalogue selection criteria (either list of catalogue entries or one config per entry)
auxiliary data from additional point, such as quality metrics
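The end-point fields above could be carried in a small structure like the one below. This is only a sketch: the class name EndpointConfig, its attribute names, the apply_mapping helper, and the example mapping are all hypothetical, and the real API-Configuration file format has not been decided in these notes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EndpointConfig:
    """One end-point entry in the API-Configuration file (hypothetical schema)."""
    url: str                                     # base URL of the remote API
    architecture: str                            # "CKAN", "Socrata", "Dataverse", ...
    auth_token: Optional[str] = None             # None when no token is required
    metadata_mapping_file: Optional[str] = None  # path to the field-mapping file
    catalogue_entries: list = field(default_factory=list)  # selected catalogue entries
    aux_data_url: Optional[str] = None           # auxiliary data, e.g. quality metrics


def apply_mapping(remote_record: dict, mapping: dict) -> dict:
    """Rename remote fields to local names; fields not in the mapping are dropped."""
    return {
        local: remote_record[remote]
        for remote, local in mapping.items()
        if remote in remote_record
    }


# Example values; the mapping below is illustrative, not the project's real mapping.
toronto = EndpointConfig(
    url="https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/",
    architecture="CKAN",
)
example_mapping = {"title": "name", "notes": "description"}
mapped = apply_mapping({"title": "A", "notes": "B", "extra": 1}, example_mapping)
```

Keeping the mapping as a plain remote-to-local dictionary means the same import code can be reused for other architectures by swapping the mapping file.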
Start with the City of Toronto Open Data Portal.
There are many ways to configure datastore_search calls; more info here: https://docs.ckan.org/en/2.9/maintaining/datastore.html#ckanext.datastore.logic.action.datastore_search
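As a rough illustration of configuring a datastore_search call, the sketch below builds the request URL with a few of the parameters listed in the linked CKAN documentation (limit, offset, filters, q). The helper names are hypothetical. The resource identifier is passed as id to match the URL form used elsewhere in these notes; the linked documentation names that parameter resource_id.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/"


def datastore_search_url(resource_id, limit=100, offset=0, filters=None, q=None):
    """Build a datastore_search URL for one page of records."""
    params = {"id": resource_id, "limit": limit, "offset": offset}
    if filters:
        # filters is a dict of column -> value, sent JSON-encoded in the query string
        params["filters"] = json.dumps(filters)
    if q:
        params["q"] = q  # full-text search string
    return BASE_URL + "datastore_search?" + urlencode(params)


def page_offsets(total, limit):
    """Offsets needed to read `total` records in pages of `limit` records."""
    return list(range(0, total, limit))
```

The response to datastore_search reports a total count, so an importer could read that from the first page and then request the remaining pages with page_offsets(total, limit).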
There are lots more API endpoints you can call that are documented here: https://docs.ckan.org/en/2.9/api/
API END POINTS
List Packages
To get a list of all package names from our CKAN instance: https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/package_list
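A minimal sketch of calling package_list from Python, using only the standard library. The helper names (action_url, unwrap, call_action) are hypothetical. It assumes the usual CKAN response envelope, a JSON object with a success flag and the payload under result.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/"


def action_url(action, **params):
    """Build the URL for a CKAN action, with optional query parameters."""
    query = "?" + urlencode(params) if params else ""
    return BASE_URL + action + query


def unwrap(response):
    """Return the "result" payload of a CKAN response, or raise if it failed."""
    if not response.get("success"):
        raise RuntimeError("CKAN call failed: %r" % (response.get("error"),))
    return response["result"]


def call_action(action, **params):
    """Perform the HTTP GET and return the unwrapped result (needs network access)."""
    with urlopen(action_url(action, **params)) as resp:
        return unwrap(json.load(resp))
```

With network access, call_action("package_list") should return the list of package names, and call_action("package_show", id="catalogue-quality-scores") the metadata for one package.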
List Resources
(for reference, https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/ is the base URL for 99% of the API endpoints you’ll hit on our portal)
Show Package
A package_show call returns a JSON object containing high-level information about the data on this page (the data owner, the last refreshed date, associated topics and civic issues, etc.). This JSON also contains high-level information for each “resource” on this page. A “resource” is one concrete data thing (like a file or a database table), and its object in this API response contains the information you’ll need to grab its contents.
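To make the shape of a package_show result concrete, here is a sketch that pulls a short summary of each resource out of the package object. The function name list_resources is hypothetical, the field names (resources, id, name, format, datastore_active) are the ones CKAN typically returns and should be checked against a real response, and the sample package is made up for illustration.

```python
def list_resources(package):
    """Summarise each resource in the "result" object of a package_show call."""
    summaries = []
    for res in package.get("resources", []):
        summaries.append({
            "id": res.get("id"),
            "name": res.get("name"),
            "format": res.get("format"),
            # True when the resource's rows can be read with datastore_search
            "datastore_active": res.get("datastore_active", False),
        })
    return summaries


# Made-up sample in the assumed shape of a package_show result.
sample_package = {
    "name": "example-package",
    "resources": [
        {"id": "res-1", "name": "example-table", "format": "CSV", "datastore_active": True},
        {"id": "res-2", "name": "example-file", "format": "PDF"},
    ],
}
summaries = list_resources(sample_package)
```

Resources with datastore_active set are the ones an importer can page through as rows; the others would have to be downloaded as files.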
Accessing Aux Data
To get records from a dataset, such as the data quality scoring dataset:
Get the package metadata: https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/package_show?id=catalogue-quality-scores
Get the “id” of the resource named quality-scores-explanation-codes-and-scores and plug it into the datastore_search API call below: https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/datastore_search?id=6d999ad7-d83c-4515-afc7-cae7ea85a1a8
Each record in the “records” sub-object of the response should be a row in the spreadsheet I showed you today. You can match its package_name and resource_name attributes to a package and resource from a package_show call.
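The lookup and matching steps above might be sketched as follows. The function names are hypothetical, the score field and the sample records are invented for illustration, and it is an assumption that package_name in the quality records corresponds to the "name" field of the package_show result.

```python
def find_resource_id(package, resource_name):
    """Return the id of the resource with the given name, or None if absent."""
    for res in package.get("resources", []):
        if res.get("name") == resource_name:
            return res.get("id")
    return None


def index_records(records):
    """Group aux-data records by their (package_name, resource_name) pair."""
    index = {}
    for rec in records:
        key = (rec.get("package_name"), rec.get("resource_name"))
        index.setdefault(key, []).append(rec)
    return index


def records_for(index, package, resource):
    """Aux-data records matching one package/resource from a package_show call."""
    return index.get((package.get("name"), resource.get("name")), [])


# Sample data in the assumed shapes; only the resource id below comes from these notes.
quality_package = {
    "name": "catalogue-quality-scores",
    "resources": [{
        "name": "quality-scores-explanation-codes-and-scores",
        "id": "6d999ad7-d83c-4515-afc7-cae7ea85a1a8",
    }],
}
sample_records = [
    {"package_name": "example-package", "resource_name": "example-table", "score": 0.9},
    {"package_name": "example-package", "resource_name": "other-table", "score": 0.4},
]
index = index_records(sample_records)
```

Indexing the records once by (package_name, resource_name) avoids rescanning the full quality dataset for every resource being imported.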