w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

So, how do you link a CSV file to its documentation... #1593

Open hoehrmann opened 3 months ago

hoehrmann commented 3 months ago

I think https://www.w3.org/TR/2024/CR-vocab-dcat-3-20240118/ should be returned to the Working Group for further work.

While I can read in the document that I can use dcterms:conformsTo to reference the JSON Schema documentation for a JSON file, and dcat:endpointDescription to reference the documentation of an API, my purpose in reading through the document was to learn how to link a CSV file to a human-readable description of its structure, or a JSON, XML, ... file in a custom format to human-readable documentation of the structure, elements, attributes, properties, ... of that format. And I am coming up empty.

If I ever had trouble finding the documentation for an API that I know exists, then dcat:endpointDescription might be useful to have. But I never really had such trouble. I do, however, very much have trouble working with proprietary CSV, XML, JSON, ... structures that are published without any documentation. And I feel very strongly that it must be obvious from this specification how to link data distributions to their documentation.

rob-metalinkage commented 3 months ago

The Profiles Vocabulary [1] allows links to resources with role qualifiers. Treating this as a polymorphic equivalent or subclass of Distribution solves this problem. I'm not aware of anything else that does it in a canonical fashion.

see https://www.w3.org/TR/dx-prof/#related-dcat

and https://www.w3.org/TR/dx-prof/alignments/dcat.ttl

[1] https://www.w3.org/TR/dx-prof/
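A rough sketch of what the role-qualified link could look like, assuming a made-up CSV distribution, profile, and documentation page under example.org. The Python below only assembles Turtle text with the standard library. The prof: and role: terms come from the Profiles Vocabulary, but hanging the profile off the distribution with dcterms:conformsTo and choosing role:guidance for human-readable documentation are assumptions of this sketch, not something the thread or DCAT prescribes.

```python
# Hypothetical IRIs: everything under example.org is made up for illustration.
DISTRIBUTION = "https://example.org/data/stops.csv#distribution"
PROFILE = "https://example.org/profiles/stops-csv"
DOCUMENTATION = "https://example.org/docs/stops-csv.html"

# Namespaces of the vocabularies used in the sketch.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "prof": "http://www.w3.org/ns/dx/prof/",
    "role": "http://www.w3.org/ns/dx/prof/role/",
}


def profile_turtle(distribution: str, profile: str, documentation: str) -> str:
    """Return Turtle that links a distribution to its documentation
    through a PROF resource descriptor carrying a role qualifier."""
    header = "\n".join(f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items())
    body = f"""
<{distribution}> a dcat:Distribution ;
    dcterms:conformsTo <{profile}> .

<{profile}> a prof:Profile ;
    prof:hasResource [
        a prof:ResourceDescriptor ;
        prof:hasRole role:guidance ;
        prof:hasArtifact <{documentation}>
    ] .
"""
    return header + "\n" + body


if __name__ == "__main__":
    print(profile_turtle(DISTRIBUTION, PROFILE, DOCUMENTATION))
```

A consumer would follow dcterms:conformsTo from the distribution to the profile, then pick the resource descriptor whose prof:hasRole matches the kind of material it wants (here, guidance) and dereference prof:hasArtifact.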

bertvannuffelen commented 2 months ago

In DCAT-AP, dct:conformsTo is used to express such information. There is no specific guidance on what a dct:Standard should look like or what it should contain, as it really depends on the publisher of the Dataset or Data Service.

Note that dct:conformsTo on a DCAT-AP Data Service relates to the protocol or interaction (e.g. SOAP, REST JSON, WFS), while the structure of the data is documented in the associated Dataset.
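To make the split between service and dataset concrete, here is a minimal sketch under stated assumptions: all example.org IRIs are hypothetical, and treating a publisher's own structure document as a dcterms:Standard is the reading proposed in this comment, not a settled interpretation. The code only builds Turtle text with the Python standard library.

```python
# Hypothetical IRIs under example.org; only the dcat: and dcterms: terms are real.
DATASET = "https://example.org/dataset/stops"
SERVICE = "https://example.org/service/stops-api"
STRUCTURE_DOC = "https://example.org/docs/stops-structure"
PROTOCOL_SPEC = "https://example.org/specs/stops-rest-api"

PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
}


def conforms_to_turtle(dataset: str, service: str,
                       structure_doc: str, protocol_spec: str) -> str:
    """Return Turtle where the service conforms to a protocol specification
    and the dataset conforms to a document describing the data structure."""
    header = "\n".join(f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items())
    body = f"""
<{service}> a dcat:DataService ;
    dcat:servesDataset <{dataset}> ;
    dcterms:conformsTo <{protocol_spec}> .

<{dataset}> a dcat:Dataset ;
    dcterms:conformsTo <{structure_doc}> .

<{structure_doc}> a dcterms:Standard .
"""
    return header + "\n" + body


if __name__ == "__main__":
    print(conforms_to_turtle(DATASET, SERVICE, STRUCTURE_DOC, PROTOCOL_SPEC))
```

Nothing in this output tells a reader whether the dcterms:Standard is a formal schema or a human-readable page, which is the gap the original question is about.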

hoehrmann commented 2 months ago

dcterms:conformsTo is for »An established standard to which the described resource conforms.« I might say a CSV file conforms to RFC 4180 or the SDMX-CSV standard, but this is about human-readable documentation, and dcterms:conformsTo is not suitable for that.

bertvannuffelen commented 2 months ago

@hoehrmann, personally I think this is more about how strictly you interpret the notion of a Standard. If a city creates a specification for its datasets, then this can be considered a standard in some respect. It is not published by an international standardisation body, but it acts in much the same way.

For me it is a bit of a grey zone, but if the document is intended to describe the structure in some formal way, it may be considered a valid entry.