Closed by init-dcat-ap-de 2 years ago
Actually, AFAIK this is by design and also applies to dcterms:conformsTo specifying the schema of the distribution.
The use case here is to allow e.g. one CSV file to be compressed using gzip (compressFormat), or a set of files of the same format and schema, split into multiple files e.g. for size reasons, to be packaged as one (packageFormat).
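To make the two cases above concrete, here is a minimal sketch in plain Python that prints Turtle for each of them. The helper `describe_distribution` and the `https://example.org/...` URIs are made up for illustration, and the IANA media type URIs are just one common way to fill these properties; this is not taken from the specification text.

```python
# Hypothetical helper for illustration only; not part of DCAT or any library.
PREFIX = "@prefix dcat: <http://www.w3.org/ns/dcat#> ."
IANA = "https://www.iana.org/assignments/media-types/"


def describe_distribution(uri, media_type, compress_format=None, package_format=None):
    """Return a Turtle description of one dcat:Distribution.

    media_type describes the content itself (after unpacking/decompressing),
    so there is exactly one of it per distribution.
    """
    props = [f"dcat:mediaType <{IANA}{media_type}>"]
    if compress_format:
        props.append(f"dcat:compressFormat <{IANA}{compress_format}>")
    if package_format:
        props.append(f"dcat:packageFormat <{IANA}{package_format}>")
    body = " ;\n    ".join(props)
    return f"{PREFIX}\n\n<{uri}> a dcat:Distribution ;\n    {body} ."


# Case 1: a single CSV file compressed with gzip.
gzipped_csv = describe_distribution(
    "https://example.org/dist/1", "text/csv", compress_format="application/gzip"
)

# Case 2: several CSV files of the same schema, packaged into one zip.
zipped_csvs = describe_distribution(
    "https://example.org/dist/2", "text/csv", package_format="application/zip"
)

print(gzipped_csv)
print()
print(zipped_csvs)
```

In both cases the single dcat:mediaType still describes every file inside, which is why a zip of mixed file types does not fit this pattern.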
If each packaged file were different, there would be no way to properly describe the format and the schema of the individual packaged files with a dcat:Distribution. Therefore, the use case here should not be "zip anything you want into one file and you have a proper distribution".
Also, having multiple formats and schemas would not solve this, as there would be no way of saying which format and which schema goes with which packaged file.
An exception would be if the contents of e.g. the zip file are standardized (e.g. a .docx file is actually a zip file and a package). But then this has its own special media type.
Maybe this should be included as a note somewhere in the document?
While I agree that this is not optimal, it is the reality. If I search for ZIP distributions (https://data.europa.eu/data/datasets?locale=de&format=ZIP) I can find e.g.: https://data.europa.eu/data/datasets/movimento-migratorio-cancellati-dei-cittadini-stranieri-in-anagrafe-per-sesso-anni-2003-2013
This is a collection of different file types. Most "ZIP" files I find are actually Shape files...
Yes. Then the question is whether this is just bad usage of DCAT (which is my opinion) or something we should aim to support. In the latter case we would probably need to make dcat:Distribution way more complex, and the question is whether the publishers currently publishing the zip files would be willing to describe their contents using this more complex approach.
I am not sure it would be necessary to make the description way more complex. If the zip file is a bundle of various files, it will probably never be possible to work with the content automatically, just from the metadata description.
Yes, if we did not aim at automatic use / proper description, then it could stay similar. I am however still struggling with saying that this is a supported case. I would rather say "think again about how you structure your data so that it can be described properly".
Hello,
with the inclusion of dcat:packageFormat and dcat:compressFormat, shouldn't it be possible to have multiple dct:format or dcat:mediaTypes?
Otherwise we would imply that all files within a zip have the same format.