schemaorg / suggestions-questions-brainstorming

Suggestions, questions, and brainstorming

encodingFormat should be distinct from contentType (e.g. to describe data.csv.gz) #42

Open cboettig opened 4 years ago

cboettig commented 4 years ago

I am attempting to document a Dataset in which the DataDownload objects are compressed CSVs, e.g. .csv.gz objects. What is the correct schema.org annotation in this case?

DataDownload includes the property encodingFormat (which is also already an inherited property on CreativeWork, though admittedly the DataDownload type allows for multiple formats of the same data).

I believe the typical definition of "encoding" would be the compression algorithm here, e.g. as RFC 2616 defines the HTTP header Content-Encoding. This is at odds with the schema.org definition of encodingFormat, which seems to refer instead to the content type (i.e. text/csv in this case), as evidenced by the suggestion to use a MIME media type (which, as I understand it, refers to the underlying type, not the compression).
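
To make the distinction concrete: Python's standard mimetypes module reports the two notions separately, as a (type, encoding) pair. This is only a minimal sketch. The file name data.csv.gz comes from the issue title, and using a fresh MimeTypes() instance is an assumption made here so that the result relies on the module's built-in tables rather than on system-specific ones.

```python
import mimetypes

# A fresh MimeTypes instance starts from the module's built-in tables.
guesser = mimetypes.MimeTypes()

# guess_type returns a (type, encoding) pair: the media type of the
# underlying content, and the program used to encode (compress) it.
media_type, content_encoding = guesser.guess_type("data.csv.gz")

print(media_type)        # media type of the uncompressed content
print(content_encoding)  # compression layer, analogous to Content-Encoding
```

With the built-in tables this should report text/csv as the type and gzip as the encoding, which is the pair of facts that a single encodingFormat value has no obvious way to carry.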

I suppose I could define the schema:encodingFormat as something like application/csv+gzip in this case, but that would seem to be a non-standard way of representing this information. Thoughts / advice much appreciated.
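
For illustration, here are the candidate annotations side by side as JSON-LD, built in Python. This is a sketch of the dilemma, not a recommendation: the helper data_download and the example.org URL are hypothetical, and none of the three values is endorsed by schema.org as the way to describe a compressed CSV.

```python
import json


def data_download(content_url, encoding_format):
    """Build a minimal schema.org DataDownload as a JSON-LD dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org/",
        "@type": "DataDownload",
        "contentUrl": content_url,
        "encodingFormat": encoding_format,
    }


url = "https://example.org/data.csv.gz"  # hypothetical URL

candidates = [
    data_download(url, "text/csv"),              # says nothing about the gzip layer
    data_download(url, "application/gzip"),      # says nothing about the CSV inside
    data_download(url, "application/csv+gzip"),  # non-standard value floated above
]

for doc in candidates:
    print(json.dumps(doc, indent=2))
```

Each candidate loses something: the first two each drop one of the two facts, and the third invents a media type that consumers are unlikely to recognize.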

cboettig commented 4 years ago

(Just a note that this is related to schemaorg/schemaorg#1155, in which it appears that fileFormat was deprecated or collapsed into encodingFormat. This seems to have cost us the ability to distinguish between how content is serialized (CSV, TSV, XML, JSON, etc.) and how it is encoded (e.g. compression, as per RFC 2616 section 14.11).)

RichardWallis commented 4 years ago

See issue #7 for the context of the move from the main Schema.org issue tracker to this repository.

smrgeoinfo commented 3 years ago

Quick note: schema:contentType is not an expected property of DataDownload. See also https://github.com/ESIPFed/science-on-schema.org/issues/131 and https://github.com/ESIPFed/science-on-schema.org/issues/132