Very rough draft of a potential JSON schema we could use for datasets. Note that the file records are nested under the `resources` property of the dataset object, rather than having a separate JSON object per file for the same dataset.
{
  "type": "object",
  "properties": {
    "title": {
      "type": "string"
    },
    "owner": {
      "type": "string"
    },
    "pageURL": {
      "type": "string"
    },
    "dateCreated": {
      "type": "string"
    },
    "dateUpdated": {
      "type": "string"
    },
    "license": {
      "type": "string"
    },
    "description": {
      "type": "string"
    },
    "tags": {
      "type": "array",
      "description": "Could make an array of objects with specifier for tags from original dataset, ones manually added and ones added by the pipeline",
      "items": {
        "type": "string"
      }
    },
    "resources": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "fileName": {
            "type": "string"
          },
          "fileSize": {
            "type": "string"
          },
          "fileSizeUnit": {
            "type": "string",
            "description": "Could we do away with this prop and just enforce file sizes to be bytes?"
          },
          "fileType": {
            "type": "string"
          },
          "assetUrl": {
            "type": "string"
          },
          "dateCreated": {
            "type": "string"
          },
          "dateUpdated": {
            "type": "string"
          },
          "numRecords": {
            "type": "number"
          }
        },
        "required": [
          "fileName",
          "fileType",
          "assetUrl"
        ]
      }
    }
  },
  "required": [
    "title",
    "owner",
    "pageURL",
    "dateCreated"
  ]
}
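To make the nesting concrete, here is a rough sketch of what one dataset record could look like under this draft, with two files nested under `resources`. Every value in `example_dataset` is made up for illustration, and `missing_required` is a hypothetical helper that only checks the `required` lists from the draft; it is not a full JSON Schema validator and does not check types.

```python
import json

# Required keys copied from the draft schema above.
DATASET_REQUIRED = ["title", "owner", "pageURL", "dateCreated"]
RESOURCE_REQUIRED = ["fileName", "fileType", "assetUrl"]

# Hypothetical example record; every value here is made up for illustration.
example_dataset = {
    "title": "Example dataset",
    "owner": "Example Council",
    "pageURL": "https://example.org/dataset/example",
    "dateCreated": "2022-10-05",
    "tags": ["example"],
    "resources": [
        {
            "fileName": "example.csv",
            "fileType": "CSV",
            "assetUrl": "https://example.org/dataset/example/example.csv",
            "numRecords": 120,
        },
        {
            "fileName": "example.geojson",
            "fileType": "GeoJSON",
            "assetUrl": "https://example.org/dataset/example/example.geojson",
        },
    ],
}


def missing_required(dataset):
    """Return the missing required keys, e.g. 'resources[1].assetUrl'."""
    # Top-level keys first, in the order the schema lists them.
    missing = [key for key in DATASET_REQUIRED if key not in dataset]
    # Then each nested file record; "resources" itself is optional in the draft.
    for index, resource in enumerate(dataset.get("resources", [])):
        for key in RESOURCE_REQUIRED:
            if key not in resource:
                missing.append(f"resources[{index}].{key}")
    return missing


if __name__ == "__main__":
    print(json.dumps(example_dataset, indent=2))
    print("Missing required keys:", missing_required(example_dataset))
```

One record per dataset means the dataset-level fields (`title`, `owner`, and so on) are stored once, and each file only carries its own metadata.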
_Originally posted by @JackGilmore in https://github.com/OpenDataScotland/the_od_bods/issues/163#issuecomment-1268595248_