Open sabinem opened 3 years ago
So I don't know about that discussion (is it really a discussion? Wasn't it just a proposal?), but checking both v2 and v3 of DCAT, I'm not sure I understand it the same way.
First of all, I would use the official definition. The statement "should all have the same content and only vary in language format or resolution" doesn't come from there (right?), and I find it too simplistic and a bit confusing.
I find this sentence super important: "Nevertheless, the question of whether different representations can be understood to be distributions of the same dataset, or distributions of different datasets, is application specific. Judgement about how to describe them is the responsibility of the provider, taking into account their understanding of the expectations of users, and practices in the relevant community." (https://www.w3.org/TR/vocab-dcat-3/#Class:Distribution)
In general: is it really an issue that should be fixed? Or should we just explain the concept better, but leave it to the data publisher to decide?
I think we are on a difficult topic here. I, for instance, would personally try to avoid dataset proliferation (and thus NOT model different budget years as different datasets) because, in my opinion, this makes the data portal messier. But that's just a personal way of doing things.
My proposal is to explain, give examples, and show best practices, but leave it up to data publishers how they manage their data.
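To make the two modeling options for the budget-year case concrete, here is a minimal sketch that prints both as Turtle. Everything under `https://example.org/`, the titles, and the years are made-up placeholders, not taken from any real catalog. Option B assumes the `dcat:DatasetSeries` class and the `dcat:inSeries` property from DCAT 3. It is only meant as an illustration, not as a recommended profile.

```python
# Hypothetical example data: base URI, titles and years are illustrative only.
BASE = "https://example.org/"
YEARS = [2021, 2022]

PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
)


def one_dataset_many_distributions(years):
    """Option A: a single dataset; each budget year is a distribution."""
    lines = [PREFIXES, f"<{BASE}budget> a dcat:Dataset ;", '    dct:title "Budget" ;']
    dists = " ,\n        ".join(f"<{BASE}budget/dist/{y}>" for y in years)
    lines.append(f"    dcat:distribution {dists} .")
    for y in years:
        lines.append(f"<{BASE}budget/dist/{y}> a dcat:Distribution ;")
        lines.append(f'    dct:title "Budget {y}" .')
    return "\n".join(lines)


def dataset_series(years):
    """Option B: a dataset series; each budget year is its own dataset."""
    lines = [PREFIXES, f"<{BASE}budget> a dcat:DatasetSeries ;", '    dct:title "Budget" .']
    for y in years:
        lines.append(f"<{BASE}budget/{y}> a dcat:Dataset ;")
        lines.append(f'    dct:title "Budget {y}" ;')
        # Link the yearly dataset back to the series it belongs to.
        lines.append(f"    dcat:inSeries <{BASE}budget> .")
    return "\n".join(lines)


option_a = one_dataset_many_distributions(YEARS)
option_b = dataset_series(YEARS)
print(option_a)
print()
print(option_b)
```

Option A keeps the portal at one dataset entry, while Option B creates one dataset per year plus the series. Which one fits is exactly the publisher's judgement call described in the DCAT text quoted above.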
@AFoletti Please see the discussion on DCAT about this topic: w3c/dxwg#1429. Your concerns seem to be shared by a substantial part of the DCAT and DCAT-AP community.
Class: dcat:Distribution
Conformance Problem:
Details:
dct:coverage on dcat:Distribution can be expressed with DCAT-Vocabulary, see here: https://github.com/SEMICeu/DCAT-AP/issues/197. We were told that modeling data series with distributions of a dataset is considered an antipattern, and the reference above was mentioned in that regard.
dcat:DatasetSeries (https://www.w3.org/TR/vocab-dcat-3/#Class:Dataset_Series) will be added; that might be the best way to model data series.
Proposal: