w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

A mechanism must be available to identify conformance to each inherited profile given conformance to a profile that specialises it (6.1) #214

Open kcoyle opened 6 years ago

kcoyle commented 6 years ago

75

kcoyle commented 6 years ago

Changed to: If conformance to a profile is claimed, then it should be possible to confirm conformance to each parent profile.

For document: Add note to effect that data consumers willing to access datasets that conform with the parent profile should be able to infer this conformance from the statement that the dataset conforms to the child profile
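As a rough illustration of the inference this note describes, the following Python sketch walks a profile hierarchy and collects every parent that a conformance claim would imply. The `PARENTS` table, the `ex:` identifiers, and the function name are all assumptions made for the example; nothing here is defined by the requirement itself.

```python
# Hypothetical profile hierarchy: each profile maps to the profiles it specialises.
# The identifiers are illustrative only and are not taken from any specification.
PARENTS = {
    "ex:GeoDCAT-AP": ["ex:DCAT-AP"],
    "ex:DCAT-AP": ["ex:DCAT"],
    "ex:DCAT": [],
}


def inferred_conformance(claimed, parents=PARENTS):
    """Return the claimed profile plus every ancestor it specialises."""
    seen = set()
    stack = [claimed]
    while stack:
        profile = stack.pop()
        if profile in seen:
            continue  # guards against cycles and repeated parents
        seen.add(profile)
        stack.extend(parents.get(profile, []))
    return seen


if __name__ == "__main__":
    print(sorted(inferred_conformance("ex:GeoDCAT-AP")))
```

A consumer that only understands the parent profile could then check whether that parent appears in the returned set, without reading any of the child's constraints.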

agreiner commented 6 years ago

What about the case where a profile uses compatible pieces from two parent profiles that are not compatible with each other?

rob-metalinkage commented 6 years ago

It's certainly possible to create models that are not internally consistent (or ontologies that are not "satisfiable"), but it's not the job of the descriptor to disallow this. It's possible that constraint specification languages could provide mechanisms to check for internal consistency.

What profiledesc does do, however, and by design, is allow you to locate the pieces so that manual or automated consistency checks become feasible.

rob-metalinkage commented 6 years ago

@kcoyle profiledesc enables this, but it's still within the scope of the constraint language(s) to enable conformance checking. What profiledesc explicitly supports is for these conformance inferences to be made without having to understand all the constraints.

agreiner commented 6 years ago

I was reacting to the suggestion that one will by definition be able to infer conformance with the parent profile from the fact that it conforms with the child profile. I may have a different idea in my mind about what it means to be a parent or child than you have. I am thinking that, if we enable inheritance, one could use many existing profiles (as well as vocabularies) as parents for a single child.

rob-metalinkage commented 6 years ago

Should be fine - conformance with one parent should not break conformance with another - for example GeoDCAT-AP inherits from both DCAT-AP and GeoDCAT

If you have a situation where schemas are fixed and you can't do this kind of mixing (e.g. typical XSD), then this simply isn't possible for that platform to implement, but it doesn't mean the conceptual model can't allow it.

We might have an XSD-based profile subclass that adds the restriction that it cannot inherit from two parents that constrain the schema structure (but may be able to inherit from multiple parents that constrain the cardinality of elements within the schema).

It's all a matter of separating concerns so each piece doesn't end up with lots of complex qualifications and special cases. Profiledesc allows axioms to state conformance expectations, but doesn't allow these to be inferred from the arbitrary constraint languages used (including PDF and spreadsheet forms!).

dr-shorthair commented 6 years ago

@agreiner - indeed, we may need to work on this rule a bit. If there is more than one dependency (parent) and they address different aspects of the overall problem (e.g. by combining PROV-O and DCAT) then what does it mean to say that an instance of the profile is a valid instance of both parents? Under RDF OWA there is less difficulty, as you can always add information not anticipated in a vocabulary. But an individual that conforms to a SHACL or ShEx script that mentions elements from both parents might not conform to validation scripts that test conformance to each parent separately. And that's just in the RDF world. @rob-metalinkage points out that other issues arise on other platforms.

These are all reasonable requirements and reasonable technologies, so I'm pretty sure we can accommodate this with just a little care. But we will also need to be reasonable about it and, while not ignoring the corner cases, make sure that the common cases drive our thinking.
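To make the concern about separate validation scripts concrete, here is a minimal Python sketch. The validators are toy stand-ins for SHACL or ShEx shapes, and the property sets and function names are assumptions chosen for the example. It shows a record that passes a child check combining two vocabularies yet fails each closed parent check, while open-world parent checks accept it.

```python
# Toy validators standing in for SHACL/ShEx shapes. They are assumptions made
# for illustration and do not reproduce any real shape definitions.
def make_closed_validator(required, allowed):
    """Require some properties and reject any property outside `allowed`."""
    def validate(record):
        keys = set(record)
        return required <= keys and keys <= allowed
    return validate


def make_open_validator(required):
    """Require some properties but tolerate any additional ones (open world)."""
    def validate(record):
        return required <= set(record)
    return validate


# Hypothetical parents, each closed over its own vocabulary.
closed_dcat_parent = make_closed_validator({"dct:title"}, {"dct:title", "dcat:keyword"})
closed_prov_parent = make_closed_validator({"prov:wasGeneratedBy"}, {"prov:wasGeneratedBy"})

# Hypothetical child that mixes elements from both parents.
child = make_closed_validator(
    {"dct:title", "prov:wasGeneratedBy"},
    {"dct:title", "dcat:keyword", "prov:wasGeneratedBy"},
)

# Open-world versions of the same parent requirements.
open_dcat_parent = make_open_validator({"dct:title"})
open_prov_parent = make_open_validator({"prov:wasGeneratedBy"})

record = {"dct:title": "Example dataset", "prov:wasGeneratedBy": "ex:activity1"}

if __name__ == "__main__":
    print("child:", child(record))
    print("closed parents:", closed_dcat_parent(record), closed_prov_parent(record))
    print("open parents:", open_dcat_parent(record), open_prov_parent(record))
```

Whether parent conformance can be inferred from child conformance therefore depends on whether the parent's constraints are written as open or closed, which is a property of the constraint language and not of the profile descriptor.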

kcoyle commented 6 years ago

Looking at profileDesc I don't see this functionality explicitly expressed. It may be inherent in roles, but then much depends on how far we wish to go in defining specific roles. It actually appears to me that at least for the FPWD we do not need to specify anything relating to inheritance, and in fact we may not need to in order to fulfill 1.0. If that is so, then I could imagine a note or some other document that specifically addresses how dependencies between profiles are handled when that is desirable. To be sure, creating profiles with dependencies should not be required.

rob-metalinkage commented 6 years ago

Requirement 6.8.2 "RPFRP" starts off with the requirement, and these cannot be met by existing ad-hoc documentation of profiles (PDF, Schematron, Word docs, etc., nor SHACL and ShEx as far as I can tell):

  1. Machine-readable specifications of application profiles need to be easily publishable, and optimize re-use of existing specifications.
  2. Application profiles need a rich expression for the validation of metadata
  3. Profiles must have properties for at least two levels of documentation: 1) short definition 2) input and editing guidance
  4. Profiles must support declaration of vocabulary constraints
  5. A mechanism must be available to identify conformance to each inherited profile given conformance to a profile that specialises it.

(#2 points towards SHACL etc, but the other aspects motivate profileDesc)

kcoyle commented 6 years ago

Rob, I don't actually see how profileDesc fulfills these requirements, and I think there is at least one requirement fulfilled by profileDesc that I don't see here.

For the latter, the main function that profileDesc fulfills is to connect the dataset itself to resources related to it, primarily any application profiles (in any form or serialization), documentation, validation, etc.

For the existing requirements, AFAIK profileDesc does not determine the content of the application profile, so does not define validation rules, documentation, profile vocabulary, etc. These are aspects of the profile itself and from what we discussed at the F2F the profile is a black box to profileDesc. If that's not how you see it then we need to take up that discussion again, perhaps in plenary.

rob-metalinkage commented 6 years ago

We may need to circle around to get a mutual understanding of context here: "connect the dataset itself to resources related to it" doesn't fit my understanding. The connection between a dataset and metadata would be through dct:conformsTo; the resource being pointed at may be a prof:Profile, a subclass of dct:Standard. If it is a Profile, then profileDesc allows discovery of both how a specific profile relates to other standards and where the (many possible types of) actionable resources describing the profile live.

In this way it fulfils the need to be able to find profile details, which would be hard if these were attached in an ad-hoc way, and to understand the most basic things about them which the resources cannot yet describe - how they relate to each other.

So we still need to provide "guidance" on what types of profile description language might best suit profiles of DCAT, but we also need to allow DCAT to be descriptive of data interoperability (aka profile conformance). As there is no existing vocabulary that anyone has identified, profileDesc can extend the expressivity of DCAT, in the same way that using ODRL, PROV or another special-purpose ontology does.

(The discussion gets a bit recursive unfortunately - and similar concepts appear in different contexts. :-( )

kcoyle commented 6 years ago

dct:conformsTo references "an established standard", and doesn't seem to apply to a general resource. What I haven't thought through is whether it makes sense at both a general resource level and a property level. I'm going to ping DC experts about that, but it definitely is defined as an object property.

rob-metalinkage commented 6 years ago

note that prof:Profile rdfs:subClassOf dct:Standard is consistent with Profiles having URI identifiers.
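A small Python sketch of why the subclass axiom matters for dct:conformsTo: with prof:Profile declared as a subclass of dct:Standard, anything typed as a profile is also inferred to be a dct:Standard, which is the range discussed in this thread. The triples are plain tuples, the `ex:` resources are invented for the example, and the helper is a hand-rolled stand-in for an RDFS reasoner, not a library API.

```python
# Triples as plain tuples. The ex: resources are made up for illustration.
TRIPLES = {
    ("prof:Profile", "rdfs:subClassOf", "dct:Standard"),
    ("ex:myProfile", "rdf:type", "prof:Profile"),
    ("ex:myDataset", "dct:conformsTo", "ex:myProfile"),
}


def types_of(resource, triples):
    """Return asserted types plus those inferred by following rdfs:subClassOf."""
    types = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    frontier = list(types)
    while frontier:
        cls = frontier.pop()
        for s, p, o in triples:
            if s == cls and p == "rdfs:subClassOf" and o not in types:
                types.add(o)
                frontier.append(o)
    return types


if __name__ == "__main__":
    # The object of dct:conformsTo is a profile, and hence also a dct:Standard.
    print(sorted(types_of("ex:myProfile", TRIPLES)))
```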

kcoyle commented 6 years ago

DCterms is in the process of becoming an ISO standard, and the DC community has on its agenda to look at the differences in how dct:Standard is defined, to wit:

DCMI Metadata Terms defines the class "Standard" as follows:

A basis for comparison; a reference point against which other things can be
evaluated.

The draft ISO 15836-2 defines "Standard" as follows:

document, established by consensus and approved by a recognized body, that
provides, for common and repeated use, rules, guidelines or characteristics
for activities or their results, aimed at the achievement of the optimum
degree of order in a given context 

[Source: ISO/IEC Guide 2:1996]

If DC adopts the ISO definition, then the definition may or may not be applicable to all profiles.

I personally feel that the ISO definition is both over strict and at the same time lacks the key element of community and a commitment to maintenance. But I don't know which way things will go.

kcoyle commented 6 years ago

@rob-metalinkage Although profiles must have IRIs to be available for content negotiation, can we exclude profiles that, for example, exist on paper? The answer may be "yes" for DCAT profiles, since DCAT is an ontology. Our definition does not specify that every profile is online. If we are limiting to that, it needs to be in the definition.

makxdekkers commented 6 years ago

As far as I am concerned, the proposal at DCMI to change the definition of dct:Standard, in effect narrowing it, is exactly the wrong thing to do. It would be also in conflict with the DCMI Namespace Policy, section 3C, as it would be contrary to what is stated: "if [...] such changes of meaning are likely to have substantial impact on either machine processing of DCMI terms or the functional semantics of the terms, then these changes will be reflected in a change of URI for the DCMI term or terms in question". I know of several implementations that would break because of the narrower definition.

kcoyle commented 6 years ago

@makxdekkers I would love to get some concrete examples from you that I can bring to the DC usage board.

makxdekkers commented 6 years ago

DCAT-AP uses dct:conformsTo with range dct:Standard in three places:

In none of those cases is it expected (although not excluded) that the object is a "standard" in ISO terms.

kcoyle commented 6 years ago

Thanks, @makxdekkers. Have added that info to DC usage board discussion.

rob-metalinkage commented 6 years ago

The proposed ISO wording also hinges on the definitions of "document", "recognized", "body" etc. Can a "community of 1 person today, and some unknown number of future collaborators opting in" define a "standard" in the scope of their own activities? They can look in the mirror and recognise themselves :-) . All it implies is that if they publish something there is a "reference point" as per the original scope. That said, it feels certain to derail the expectations of some people reading it.

kcoyle commented 6 years ago

I would lay money on the ISO wording of "recognized body" = ISO and others of that nature. (W3C, NISO, IFLA, etc.) The definition has an air of being self-serving and from ISO's rather narrow view of standards. I don't think that DC is going to adopt that language, but it's under discussion.

One possibility for the DC term "conformsTo" is to remove the range "dct:Standard" so that one can state any kind of conformance without running into the question of "is this a standard?". I've made that suggestion to the DC group that is reviewing DC terms.

makxdekkers commented 6 years ago

@kcoyle That sounds very helpful -- so instead of narrowing down the definition of dct:Standard, which I think is the wrong thing to do, the solution could be to widen the definition of dct:conformsTo. Good!

andrea-perego commented 6 years ago

@kcoyle said:

@rob-metalinkage Although profiles must have IRIs to be available for content negotiation, can we exclude profiles that, for example, exist on paper? The answer may be "yes" for DCAT profiles, since DCAT is an ontology. Our definition does not specify that every profile is online. If we are limiting to that, it needs to be in the definition.

I agree. I would be very careful about tightly binding the requirement for a profile to have a URI with the one requiring that URI to be dereferenceable.

Also looking at profile conneg, just having a URI for a profile would be a dramatic improvement wrt the current situation and it would cover the most common scenarios - e.g., where clients know which profile they are looking for, and they are not interested in getting a profile description and/or definition (they just want a resource, represented by using the requested profile).

I see the ability to dereference a profile URI as an additional (more advanced) requirement, bound to use scenarios requiring to get access to the profile description/definition.

If we put them together, this may have the unintended result of limiting the adoption of and the support to the "main" one (a URI for a profile). We'd better be modular.

kcoyle commented 6 years ago

I agree that for content negotiation we may need to require that profiles be online and accessible, but is there a reason why one could not cite a profile that exists as a document on a shelf, just as one would cite a book or a journal article? Surely there will be times when we cite standards that aren't available online, and profiles have some of the same characteristics of standards. (No, I don't know what that would look like, but I'm just not sure that we should cut off any possibilities.)

larsgsvensson commented 6 years ago

My intention always was that the requirements are that "a profile MUST have a URI" and that "the URI identifying the profile SHOULD be an http URI so that you can look up more information about it".

I see no reason that there cannot be (at least) two kinds of profile-aware servers:

  1. Those that are hard-coded to work with a specific (set of) profile(s) and
  2. Those that can actually look up a profile, evaluate the constraints and then see if it can tweak the data in order to match the requirements set in the profile.

This would be similar to schema-aware XML-processors that can also be hard-coded ("specialized to one or a fixed set of pre-determined schemas") or general-purpose ("those not specialized to one or a fixed set of pre-determined schemas"); cf. XML-Schema 1.1 §4.3.2. I guess that profile-aware servers will be of the first kind, at least in the beginning.
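The two kinds of server could be sketched roughly as follows in Python. The profile URIs, the `lookup` callable, and the idea of a profile reducing to a set of required properties are all simplifying assumptions for the example; no real registry or content negotiation API is implied.

```python
# Hypothetical profile URI known in advance to a hard-coded server.
KNOWN_PROFILES = {"http://example.org/profile/a"}


def hard_coded_supports(profile_uri):
    """Kind 1: only a fixed, built-in set of profiles is recognised."""
    return profile_uri in KNOWN_PROFILES


def general_purpose_supports(profile_uri, record, lookup):
    """Kind 2: look up the profile's constraints, then check the data against them.

    `lookup` stands in for dereferencing the profile URI. It returns the set of
    required properties, or None when nothing can be retrieved.
    """
    required = lookup(profile_uri)
    if required is None:
        return False
    return required <= set(record)


if __name__ == "__main__":
    # A dict plays the role of whatever the profile URI would resolve to.
    registry = {"http://example.org/profile/b": {"dct:title"}}
    data = {"dct:title": "Example dataset"}
    print(hard_coded_supports("http://example.org/profile/b"))
    print(general_purpose_supports("http://example.org/profile/b", data, registry.get))
```

The first kind only needs the profile to have a URI; the second kind additionally needs that URI to lead to something machine-actionable, which matches the MUST/SHOULD split above.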