signekb closed 2 months ago
Looking at the JSON at https://datapackage.org/profiles/2.0/datapackage.json, this is a JSON schema describing the structure of the datapackage.json file, right? But, if I understand correctly, our properties template file is more like a sample datapackage.json file (with some fields empty and some set to a default value). So to transform the schema into this template file, it's not enough to drop the fields we don't need, because the two structures differ. E.g., in the template file we want a resources field at the top level (i.e. {"resources": [...]}), but in the schema we have {"properties": {"resources": {"items": {...}}}}.
Am I misunderstanding this ^^? If not, would it be easier to construct a template file by hand or does anyone have a better idea?
@seedcase-project/developers
Ah, I think you're right! It does need some rearranging to be used as a template. In my mind it would also be easier to construct a template file by hand than to write specific logic to rearrange the JSON at the URL, even though it would be nice if it could rely on the spec directly. What do you think, @lwjohnst86?
Or another thought would be to see if there's a library that can generate a sample json object based on a json schema? And then we could just drop fields we don't need. I looked very quickly but no immediate clear hits.
Well, it doesn't seem that hard to do, so I'm having a look at it today^^
So the role of the create_properties_template() function is to take the schema you mentioned, @martonvago, and get it into the correct format to make a usable data package spec. So, for instance: keeping only the stuff in properties, moving those items up a level in the nesting (so they are first level), and dropping all the subnested propertyOrder etc. items, so that what we are left with is the exact spec needed for a folder to be a data package. Editing the existing fields would be something like giving them a default value of nothing ("").
Right now, the schema looks like:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Data Package",
  "description": "Data Package",
  "type": "object",
  "required": [
    "resources"
  ],
  "properties": {
    "$schema": {
      "default": "https://datapackage.org/profiles/1.0/datapackage.json",
      "propertyOrder": 10,
      "title": "Profile",
      "description": "The profile of this descriptor.",
      "type": "string"
    },
    "name": {
      "propertyOrder": 20,
      "title": "Name",
      "description": "An identifier string.",
      "type": "string",
      "context": "This is ideally a url-usable and human-readable name. Name `SHOULD` be invariant, meaning it `SHOULD NOT` change when its parent descriptor is updated.",
      "examples": [
        "{\n \"name\": \"my-nice-name\"\n}\n"
      ]
    },
    ...
But it should look like:
{
  "$schema": "https://datapackage.org/profiles/1.0/datapackage.json",
  "name": "",
  ...
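To make the rearrangement described above concrete, here is a minimal sketch in Python. It is not the actual create_properties_template() implementation; the function name sketch_properties_template, the choice of empty values per JSON type, and the trimmed inline schema are all assumptions made for illustration. It keeps only the entries under properties, lifts them to the top level, and uses each entry's default when present, otherwise an empty value.

```python
import copy
import json

# Assumed mapping from JSON Schema type names to an "empty" value factory.
EMPTY_VALUES = {"string": str, "array": list, "object": dict}


def sketch_properties_template(schema: dict) -> dict:
    """Hypothetical helper: turn a JSON Schema into a sample descriptor.

    Only the top-level `properties` are kept. Schema-only keys such as
    `propertyOrder`, `title`, and `description` are dropped because only
    the property name and a starting value are carried over.
    """
    template = {}
    for name, spec in schema.get("properties", {}).items():
        if "default" in spec:
            # Copy so the template never shares objects with the schema.
            template[name] = copy.deepcopy(spec["default"])
            continue
        type_name = spec.get("type")
        # Fall back to an empty string when the type is missing or not a
        # single type name (an assumption, not something the spec says).
        factory = EMPTY_VALUES.get(type_name, str) if isinstance(type_name, str) else str
        template[name] = factory()
    return template


# A trimmed, hand-written stand-in for the schema at the URL above.
schema = {
    "title": "Data Package",
    "type": "object",
    "required": ["resources"],
    "properties": {
        "$schema": {
            "default": "https://datapackage.org/profiles/1.0/datapackage.json",
            "propertyOrder": 10,
            "type": "string",
        },
        "name": {"propertyOrder": 20, "title": "Name", "type": "string"},
        "resources": {"type": "array", "items": {"type": "object"}},
    },
}

template = sketch_properties_template(schema)
print(json.dumps(template, indent=2))
```

This only handles the first level of nesting. Filling in a sample resource from properties > resources > items would need the same step applied recursively, and dropping unwanted fields would still be a separate decision.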
Create dataclasses to represent the top levels of the data package spec, including most* package and resource properties, but excluding resource > schema and everything under there.
* Which ones we want to include is a work in progress.
Dataclasses:
We want to be able to instantiate the classes with no arguments, so provide sensible default values.
Ideally, the classes should be generated automatically based on the data package spec.
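As a rough illustration of the no-argument requirement, the sketch below shows one way such dataclasses could look. The class names PackageProperties and ResourceProperties and the handful of fields chosen are assumptions for illustration only, since the thread leaves the final set of properties open. Mutable defaults use default_factory so that instances do not share the same list.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ResourceProperties:
    """Hypothetical subset of resource properties (`schema` left out)."""

    name: str = ""
    path: str = ""
    title: str = ""
    description: str = ""


@dataclass
class PackageProperties:
    """Hypothetical subset of top-level package properties."""

    name: str = ""
    title: str = ""
    description: str = ""
    # default_factory gives every instance its own list.
    resources: list[ResourceProperties] = field(default_factory=list)


# Both classes can be created with no arguments.
package = PackageProperties()
package.resources.append(ResourceProperties(name="my-nice-name"))

# asdict() converts nested dataclasses into plain dicts, ready for JSON.
package_dict = asdict(package)
print(package_dict)
```

Hand-written classes like these would not stay in sync with the spec automatically, so generating them from the schema would need its own tooling.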