frictionlessdata / ckanext-datapackager

CKAN extension for importing/exporting Data Packages.

Missing metadata fields when exporting data package #49

Open danfowler opened 8 years ago

danfowler commented 8 years ago

Without a schema on each resource, these are technically not Tabular Data Packages, even if all the resources are tabular.

This dataset, https://datahub.io/dataset/period-table-4716738971 (imported using "Import Data Package"), for example, exports the following datapackage.json:

{
  "description": "Example Data Package featuring the periodic table.", 
  "license": {
    "type": "ODC-PDDL-1.0", 
    "title": "ODC-PDDL-1.0"
  }, 
  "title": "Periodic Table", 
  "keywords": [
    "atom", 
    "chemistry", 
    "element"
  ], 
  "resources": [
    {
      "url": "https://datahub.io/dataset/aa320a4f-9ae4-4bc7-95c9-f3703bd5ceec/resource/85be2f85-17a0-4447-87df-e54b47477557/download/data.csv", 
      "title": "data", 
      "name": "data", 
      "format": "CSV"
    }
  ], 
  "name": "period-table-4716738971"
}
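
A quick way to spot the problem in an exported descriptor is to list the resources that carry no `schema` key. This is only a minimal sketch: the helper name `resources_missing_schema` is hypothetical, and the inline descriptor is a trimmed copy of the export above.

```python
import json


def resources_missing_schema(descriptor):
    """Return the names of resources that have no 'schema' key."""
    return [
        resource.get("name", "<unnamed>")
        for resource in descriptor.get("resources", [])
        if "schema" not in resource
    ]


# Trimmed copy of the exported datapackage.json shown above.
exported = json.loads("""
{
  "name": "period-table-4716738971",
  "resources": [
    {"name": "data", "title": "data", "format": "CSV"}
  ]
}
""")

# Every name printed here is a resource exported without a schema.
print(resources_missing_schema(exported))
```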

danfowler commented 8 years ago

Related comment:

https://datahub.io/dataset/rockhampton-regional-council-bus-stops#comment-2824345324

I downloaded the datapackage.json, but it only had high-level metadata and did not contain metadata for the columns in the dataset. I also expected that the resource would be downloaded as one package with the metadata.

EarlButterworth commented 8 years ago

My assumption was that, since it was uploaded as a data package, it would be delivered on download as a data package, exactly as it had been uploaded, i.e. a single ZIP file containing the datapackage.json and the CSV. Instead, all that was delivered was a partial datapackage.json that did not contain the Resource Schema.

The aim is to bring simplicity to data publishing and consumption. Therefore, a consistent user experience (UX) should be provided, i.e. what goes in is what comes out.

danfowler commented 8 years ago

Hi @EarlButterworth, I'm creating a new issue based on your comment:

https://github.com/ckan/ckanext-datapackager/issues/52

Thanks!

Stephen-Gates commented 6 years ago

Agree with the above. Testing using v1.0.0.

Information lost includes:

package:

resources:

amercader commented 6 years ago

Implementation

We already discussed licenses in #62 and schemas are now fixed.

For the rest of the fields, modify the ckan-datapackage-tools converter to store them as extras (except mediatype, which maps to mimetype on resources). Use these extras on the way out to generate the Data Package descriptor.
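
The round trip described here could look roughly like the sketch below. It is not the ckan-datapackage-tools API: the function names, the `MAPPED_FIELDS` set, and the shape of the `extras` container (a plain dict on the resource) are all assumptions made for illustration.

```python
# Hypothetical sketch only; these helpers are NOT part of ckan-datapackage-tools.

# Assumed: fields the converter already maps one-to-one onto CKAN resources.
MAPPED_FIELDS = {"name", "title", "format"}


def to_ckan_resource(dp_resource):
    """Import direction: keep unmapped Data Package fields as extras."""
    ckan_resource = {"extras": {}}
    for key, value in dp_resource.items():
        if key == "mediatype":
            # mediatype is the exception: it maps to mimetype on resources.
            ckan_resource["mimetype"] = value
        elif key in MAPPED_FIELDS:
            ckan_resource[key] = value
        else:
            ckan_resource["extras"][key] = value
    return ckan_resource


def to_dp_resource(ckan_resource):
    """Export direction: rebuild the descriptor, restoring the extras."""
    dp_resource = {}
    for key, value in ckan_resource.items():
        if key == "mimetype":
            dp_resource["mediatype"] = value
        elif key == "extras":
            dp_resource.update(value)
        else:
            dp_resource[key] = value
    return dp_resource


# Illustrative resource; the field values are made up for the example.
original = {
    "name": "data",
    "format": "CSV",
    "mediatype": "text/csv",
    "encoding": "utf-8",
}
ckan_side = to_ckan_resource(original)
round_tripped = to_dp_resource(ckan_side)
```

With this shape, `round_tripped` equals `original`, which is the "what goes in is what comes out" behaviour requested earlier in the thread. How extras are actually persisted in CKAN is left open here.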

Estimate

0.5 day