ckan / ckanapi

A command line interface and Python module for accessing the CKAN Action API

Provide example of how to use the new package_revise #181

Open mcarans opened 3 years ago

mcarans commented 3 years ago

How do I use the new CKAN 2.9 function package_revise?

Currently to do a resource update with file upload, I can do something like:

    file = open('/tmp/datafile.xlsx', 'rb')
    files = [('upload', file)]
    call_action('resource_update', {'id': 'my uuid', 'name': 'MyResource'...}, apikey=xyz, files=files)

Let's say I have a dataset named "MyDataset" that I want to update by removing resources 0 and 2, updating the description of resource 1, and uploading a new file for resource 1. I understand from the documentation that the parameters would be something like:

    match = {"name":"MyDataset"}
    filter = ["-resources__0", "-resources__2"]
    update = {"resources__1__description": "My new description"}
    update__resources__1__upload@/tmp/datafile.xlsx

Can you show in code how I pass those things to call_action? For example, do they all go in the second parameter (data_dict)? If so, how does the file upload go in there, since it isn't a key-value pair? Do I still need to use the "files" parameter?

wardi commented 3 years ago

The remote CKAN action API may be called in one of three ways:

  1. GET request with string parameters
  2. POST request with JSON body
  3. multipart POST request with string parameters and file attachments

Only number 3 can be used when uploading files, and a multipart request carries only flat string parameters, so JSON can't be used to nest the other parameter values sent as part of the same call. package_revise works around this by also accepting JSON as a string for the top-level parameters, e.g.:

    myckan.call_action(
        'package_revise',
        {'match': '{"name":"MyDataset"}'},  # values need to be JSON strings when uploading files
        files={'update__resources__1__upload': myfile},
    )

I could update ckanapi to automatically convert lists and dictionaries at the top level of multipart POST requests to their JSON versions, but as far as I know only package_revise accepts this automatically.
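Until something like that exists, the conversion could be done by hand before the call. Below is a minimal sketch, not part of ckanapi: `jsonify_top_level` is a hypothetical helper name, and the match/filter/update values are copied from the question above without being checked against a live CKAN instance (in particular, how the resource indexes behave once the filter has been applied is not verified here).

```python
import json


def jsonify_top_level(data_dict):
    """Return a copy of data_dict with top-level dicts and lists JSON-encoded.

    Hypothetical helper (not a ckanapi function): multipart POST requests can
    only carry flat string parameters, so nested values are sent as JSON text.
    """
    return {
        key: json.dumps(value) if isinstance(value, (dict, list)) else value
        for key, value in data_dict.items()
    }


# Parameters taken from the question; every value ends up as a plain string.
params = jsonify_top_level({
    "match": {"name": "MyDataset"},
    "filter": ["-resources__0", "-resources__2"],
    "update": {"resources__1__description": "My new description"},
})

# Assumed usage, with `myckan` being a ckanapi RemoteCKAN instance:
# with open('/tmp/datafile.xlsx', 'rb') as myfile:
#     myckan.call_action(
#         'package_revise',
#         params,
#         files={'update__resources__1__upload': myfile},
#     )
```

The file itself still goes in the `files` argument, keyed by the flattened `update__resources__1__upload` name, exactly as in the example above.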

mcarans commented 3 years ago

Thanks! Is there any reason to use package_update any more?

wardi commented 3 years ago

No, but package_update does have a simpler interface for replacing all the metadata fields in a dataset.

ghost commented 3 years ago

What about resource_create? Does it do anything that package_revise doesn't? I mean, is it safe to replace resource_create API calls with package_revise?

paulmueller commented 2 years ago

@ghost package_revise is safe because it allows concurrent calls, whereas resource_create may result in data loss if other actions (including package_revise) are changing the dataset/package at the same time.

[EDIT: I guess it's the right time here to talk to ghosts]