frictionlessdata / ckanext-datapackager

CKAN extension for importing/exporting Data Packages.

Schema included in a data package not added to CKAN #61

Closed Stephen-Gates closed 6 years ago

Stephen-Gates commented 6 years ago

A data package can include schemas for each data resource. When I upload a datapackage.zip file containing schemas I expect these to be stored in CKAN.

If they were stored they could be

GeraldGrootRoessink commented 6 years ago

Suppose we could import/export the schema between datapackage.json and CKAN. I like that idea. I wonder, though, how that relates to the field information provided by the datastore_create API and the DataPusher extension. For example, the test for the API provided in the CKAN documentation gives this result:

    "fields": [
        {
            "type": "int",
            "id": "_id"
        },
        {
            "type": "int4",
            "id": "a"
        },
        {
            "type": "text",
            "id": "b"
        }
    ],

Will this be compatible?
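
One way to reason about the compatibility question is to translate the DataStore field list above into a Table Schema-style "fields" list. The sketch below is only an illustration: the type mapping and the function name are assumptions made for this example, not something defined by CKAN or by this extension.

```python
# Hypothetical mapping from DataStore (PostgreSQL) column types to
# Table Schema types. Only the types from the example above are covered.
DATASTORE_TO_TABLE_SCHEMA = {
    "int": "integer",
    "int4": "integer",
    "text": "string",
}


def datastore_fields_to_table_schema(fields):
    """Build a Table Schema-like dict from a DataStore "fields" list."""
    schema_fields = []
    for field in fields:
        # "_id" is the row id the DataStore adds itself, so skip it.
        if field["id"] == "_id":
            continue
        schema_fields.append({
            "name": field["id"],
            # Unknown column types fall back to the generic "any" type.
            "type": DATASTORE_TO_TABLE_SCHEMA.get(field["type"], "any"),
        })
    return {"fields": schema_fields}


fields = [
    {"type": "int", "id": "_id"},
    {"type": "int4", "id": "a"},
    {"type": "text", "id": "b"},
]
table_schema = datastore_fields_to_table_schema(fields)
```

Under these assumptions, columns "a" and "b" come out as "integer" and "string" fields; going the other way would need the inverse mapping and a decision about types the DataStore does not distinguish.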

Stephen-Gates commented 6 years ago

@GeraldGrootRoessink this is related and worth a read https://github.com/frictionlessdata/ckanext-validation/issues/1

GeraldGrootRoessink commented 6 years ago

Worth a read indeed. My understanding is that ckanext-validation, together with ckanext-scheming, introduces a new field, schema, that accepts a JSON object like this (without the whitespace):

    {
        "fields": [
            {
                "name": "regel",
                "title": "kolom 1",
                "type": "string",
                "rdfType": "http://lod.duo.nl/cdm/def/v0/regelnummer"
            },
            {
                "name": "brin",
                "title": "kolom 2",
                "type": "integer",
                "rdfType": "http://lod.duo.nl/cdm/def/v0/BRIN-V02"
            }
        ]
    }

So it should be possible to extract this part of the datapackage.json in https://github.com/frictionlessdata/ckan-datapackage-tools/blob/master/ckan_datapackage_tools/converter.py at line 116.

Am I right?

GeraldGrootRoessink commented 6 years ago

Me again. I added these lines after line 139:

    if resource.descriptor.get('schema'):
        resource_dict['schema'] = resource.descriptor['schema']

This works.
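
For readers without the converter open, the idea can be sketched on plain dictionaries. This is a simplified stand-in, not the actual code in converter.py: the function name and the handful of copied keys are assumptions made for illustration.

```python
def resource_descriptor_to_ckan(descriptor):
    """Convert a Data Package resource descriptor (a plain dict)
    into a CKAN-style resource dict. Hypothetical helper."""
    resource_dict = {}

    # Copy a few simple properties when present (illustrative subset).
    for key in ("name", "format"):
        if descriptor.get(key):
            resource_dict[key] = descriptor[key]

    # The change discussed above: carry the schema over unchanged.
    if descriptor.get("schema"):
        resource_dict["schema"] = descriptor["schema"]

    return resource_dict


descriptor = {
    "name": "example",
    "format": "csv",
    "schema": {"fields": [{"name": "regel", "type": "string"}]},
}
resource_dict = resource_descriptor_to_ckan(descriptor)
```

A descriptor without a schema simply yields a resource dict with no schema key, so existing packages are unaffected.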

amercader commented 6 years ago

@GeraldGrootRoessink That sounds like a good approach, can you submit a PR?

GeraldGrootRoessink commented 6 years ago

Submitted a PR. However, the coveralls/coverage check fails at 96.465 because of a warning:

    tests/test_converter.py::TestDataPackageToDatasetDict::test_datapackage_name_title_andversion
    /home/travis/build/frictionlessdata/ckan-datapackage-tools/.tox/py/lib/python2.7/site-packages/datapackage/package.py:420: UserWarning: Property "package.to_dict" is deprecated.
    UserWarning)

I'm afraid I'm in over my head with this one.

amercader commented 6 years ago

@GeraldGrootRoessink this is coverage complaining that you didn't add a test for your change :) Plus an unrelated warning which I fixed.

I added the test myself here, and also fixed the logic for the inverse path (dataset -> datapackage).

GeraldGrootRoessink commented 6 years ago

Super. For anyone who follows: it will only work nicely if there is a schema field defined as a JSON object. For example, I managed this using https://github.com/ckan/ckanext-scheming.
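
The caveat about the schema field being a JSON object matters most on the inverse path (dataset -> datapackage). The sketch below shows one defensive way to handle it, assuming the stored value might be either a dict or a JSON string. Both function names are hypothetical and this is not the code that was merged.

```python
import json


def normalize_schema(value):
    """Return the schema as a dict, or None if it is missing
    or is not a JSON object."""
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        try:
            parsed = json.loads(value)
        except ValueError:
            # Not valid JSON, so there is nothing usable to export.
            return None
        return parsed if isinstance(parsed, dict) else None
    return None


def ckan_resource_to_descriptor(resource_dict):
    """Build a minimal Data Package resource descriptor from a
    CKAN-style resource dict. Hypothetical helper."""
    descriptor = {}
    if resource_dict.get("name"):
        descriptor["name"] = resource_dict["name"]

    schema = normalize_schema(resource_dict.get("schema"))
    if schema:
        descriptor["schema"] = schema
    return descriptor


as_object = ckan_resource_to_descriptor(
    {"name": "example", "schema": {"fields": [{"name": "brin", "type": "integer"}]}}
)
as_string = ckan_resource_to_descriptor(
    {"name": "example", "schema": '{"fields": [{"name": "brin", "type": "integer"}]}'}
)
```

With this normalisation, a schema stored as a JSON string exports the same way as one stored as an object, and a free-text value is silently left out rather than producing an invalid descriptor.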