CKANRestClient.get_datasets() method: some sites don't support the include_private query parameter on the /api/action/package_search API endpoint, so we need logic that drops it when the request fails. We are also not getting all datasets because of the rows argument restriction: only 10 are returned by default and the maximum is 1000, so a custom pagination implementation is needed.
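A minimal sketch of how the fallback and the pagination could fit together, not the adapter's actual code. It assumes a hypothetical `search` callable that sends the query parameters to /api/action/package_search and returns the decoded JSON body as a dict, and it assumes a rejected include_private shows up as a body with "success" set to false. The name `iter_datasets` is also made up for illustration.

```python
from typing import Any, Callable, Dict, Iterator

MAX_ROWS = 1000  # upper limit for `rows` noted above


def iter_datasets(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    rows: int = MAX_ROWS,
) -> Iterator[Dict[str, Any]]:
    """Yield every dataset from package_search, one page at a time.

    `search` is a hypothetical callable: it takes the query parameters
    and returns the decoded JSON body as a dict.
    """
    params: Dict[str, Any] = {
        "rows": min(rows, MAX_ROWS),
        "start": 0,
        "include_private": True,
    }
    while True:
        body = search(dict(params))
        if not body.get("success"):
            if "include_private" in params:
                # Assumption: the site rejected the parameter, so retry
                # the same page without it.
                del params["include_private"]
                continue
            raise RuntimeError(f"package_search failed: {body.get('error')}")
        results = body["result"]["results"]
        if not results:
            return
        yield from results
        # Advance the offset; stop once the reported total is reached.
        params["start"] += len(results)
        if params["start"] >= body["result"]["count"]:
            return
```

With this shape, get_datasets() could simply collect the generator into a list; whether the retry should be limited to specific error codes is left open here.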
CKANRestClient.get_resource_fields() method: the CKAN API endpoint /api/action/datastore_info is available only via an extension. Assumption: when a site does not have the extension, we get 400 Bad Request and the adapter logic parses it into a str object, so is_response_successful() crashed on resp.get("success") because the response was of type str. The logic should be fixed.
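One possible shape for the fix, as a sketch only: check the type before calling .get(). The signature below is an assumption; only the function name and the resp.get("success") call come from the description above.

```python
from typing import Any


def is_response_successful(resp: Any) -> bool:
    """Return True only for a dict body whose "success" flag is truthy.

    Anything else (for example an error page parsed into a str) counts
    as a failed response instead of raising AttributeError.
    """
    if not isinstance(resp, dict):
        return False
    return bool(resp.get("success"))
```

With this guard, get_resource_fields() could treat a missing datastore_info endpoint as "no fields available" rather than failing the whole run.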
The adapter config was built on the assumption that the CKAN API is available directly after the host name, but there can be a custom endpoint in between. In that case we cannot pull the data (www.open_website.com/api/action/package_search - handled, www.open_website.com/additional/endpoint/api/action/package_search - not handled). We need an additional config parameter for providing the proper endpoint route to the CKAN API.
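A sketch of how such a parameter could be applied when building request URLs. The parameter name `api_prefix` and the helper `build_action_url` are hypothetical; the real config key is still to be decided.

```python
def build_action_url(host: str, action: str, api_prefix: str = "") -> str:
    """Join host, optional custom route, and the CKAN action path.

    `api_prefix` is a hypothetical config value such as
    "additional/endpoint"; leading and trailing slashes are ignored.
    """
    parts = [host.rstrip("/")]
    prefix = api_prefix.strip("/")
    if prefix:
        parts.append(prefix)
    parts.append(f"api/action/{action}")
    return "/".join(parts)
```

Leaving the parameter empty keeps the current behaviour, so existing configs would not need to change.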
Some resources pulled from CKAN may lack resource.name, so we need logic for these cases. If a resource has resource.name, then the data entity uses it as its name; if not, we construct it as "CKAN_{organizationname}{resource.id}" for minimal human-readability.
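The naming rule could look roughly like this. The function name and arguments are illustrative, and treating an empty or whitespace-only name as missing is an assumption.

```python
from typing import Any, Dict


def data_entity_name(resource: Dict[str, Any], organization_name: str) -> str:
    """Use resource.name when present, else fall back to a built name."""
    name = (resource.get("name") or "").strip()
    if name:
        return name
    # Fallback pattern from the description: CKAN_{organizationname}{resource.id}
    return f"CKAN_{organization_name}{resource['id']}"
```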
When pulling large amounts of data via the adapter, this error appeared: #20.