w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

A profile must be packaged as a self-contained artefact #228

Open dr-shorthair opened 6 years ago

dr-shorthair commented 6 years ago

Discussion moved from https://github.com/w3c/dxwg/issues/212

dr-shorthair commented 6 years ago

@agreiner commented 5 days ago

So, my concern with this is that I think profiles should be self-contained. One should not have to find parent profiles in order to make sense of a profile already in hand. I have no problem with a profile indicating its provenance, but any programmatic use of a profile is made unnecessarily complex if it requires finding parent profiles and parsing them separately. As a web app developer, I don't want to have to build a semantic engine in order to build an app that uses a profile to determine what inputs it can use.

@dr-shorthair commented 3 days ago

@agreiner doesn't that miss the point? We re-use existing technologies in order that we don't have to reinvent/redescribe them. And it can never be fully self-contained unless you also include the definitions and axiomatizations and/or shapes for DC, OWL, RDFS, RDF, URIs, etc., which is obviously not what you mean at all! I think we are talking about functions equivalent to include/import, and frankly, pre-processing a profile by getting the dependencies first seems totally reasonable to me.

@rob-metalinkage commented 2 days ago

This is an important discussion about implementation - while the implementation and the model are different, we should see if the model should provide better support for certain implementation patterns.

Certainly profiles cannot be realistically created "flat" - and must be constructed from a hierarchy - and here I will cite several reasons, (others may exist)

- the burden on clients to work out what is the same and what is different from complicated flat (self-contained) profiles is far greater than that to combine profiles
- services and catalogs can do all the work of creating flat views of profiles
- descriptive object languages all implement hierarchy
- it's far easier to write an extension with a few extra clauses than to package up and document a self-contained profile
- we probably lose semantic information about intent (even if it is possible to compare flat profiles for equivalence)
- comparison of flat profiles will be dependent on being able to parse the actual profile constraints language used. This is simply not feasible for text forms in current practice, and would require delving into the details of those languages where we think it can be done.
- specific constraints might need to specify where they come from - possibly creating a burden on us to choose a canonical profile constraints language or a set of conformance criteria about the expressivity of such languages

So, given that clients will want to access a flat view - is there anything we can do in the model to help:

for example, SKOS has skos:broader and skos:broaderTransitive

A profileOf B
B profileOf C

should entail A profileOf C

and A profileOf B,C

is legal - but loses the hierarchy. If we make profileOf point to the immediate parent only, then it's hard to refactor - to create an extra category of profiles of B with commonalities.

These are quite well known patterns - it would be great to agree on the most useful form for a "self-contained view" of a profile and how this may be defined (it's probably a profile of the profile ontology :-) )
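The entailment described above (direct profileOf statements implying transitive ones, by analogy with skos:broader and skos:broaderTransitive) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `transitive_profile_of` and the plain-tuple representation of profileOf statements are invented here and are not part of any DXWG model.

```python
from collections import defaultdict


def transitive_profile_of(direct):
    """Map each profile to every ancestor reachable through profileOf.

    `direct` is an iterable of (child, parent) pairs holding only the
    immediate profileOf statements (a hypothetical representation).
    """
    parents = defaultdict(set)
    for child, parent in direct:
        parents[child].add(parent)

    def ancestors(profile, seen):
        # Depth-first walk; `seen` also guards against accidental cycles.
        for parent in parents.get(profile, ()):
            if parent not in seen:
                seen.add(parent)
                ancestors(parent, seen)
        return seen

    return {profile: ancestors(profile, set()) for profile in list(parents)}


# A profileOf B, B profileOf C  =>  A is (transitively) a profile of C.
direct = [("A", "B"), ("B", "C")]
closure = transitive_profile_of(direct)
print(sorted(closure["A"]))
```

Keeping the direct statements and deriving the closure on demand preserves the hierarchy, whereas asserting "A profileOf B,C" directly would discard the information about which parent is immediate.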

kcoyle commented 6 years ago

@dr-shorthair Could you provide a use case for when clients need to "work out what is the same and what is different"? Also, note that #212 is one of the requirements that does not have a use case; there is only the requirement for modularity (UC 5.3), and we have not voted to accept that yet.

dr-shorthair commented 6 years ago

@rob-metalinkage - that was part of your comment (moved over from #212) - can you respond?

kcoyle commented 6 years ago

@rob-metalinkage "Certainly profiles cannot be realistically created "flat" - and must be constructed from a hierarchy - and here I will cite several reasons, (others may exist)"

In fact, all of the profiles that I am aware of that exist today are indeed flat. So flat is an option, although not-flat may be another option. As Annette has repeatedly pointed out, not all communities wish to engage with the complexity of hierarchies, modularity, or inheritance. If flat works for them, then flat needs to be an option. If I interpret the Europeana use case correctly, they would like to signal compatibility but there does not seem to be the requirement for specific hierarchical relationships. (We should check with Antoine on this, of course.)

azaroth42 commented 6 years ago

Isn't DPLA-MAP a profile of the EDM profile, as one obvious example of a hierarchy?

kcoyle commented 6 years ago

@azaroth42 The question is whether there are conceptual hierarchies of profiles or if one must "include" levels of profiles to create the profile one wishes to work with. The arguments are similar to those related to software code: you can copy it, but then it changes outside of your application; you can include it "on the fly" but then changes can trip you up. Personally, I think that we shouldn't declare profiles to be one way or the other as different communities will make different choices in this matter.

rob-metalinkage commented 6 years ago

Hierarchy in definition and "flattening" by composition during implementation is a very well known pattern implemented in many different platforms. There are multiple examples cited of hierarchies of profiles in the Use Cases.

We can postulate that some alternative "flat only" architecture might possibly work, but the onus is now on proponents of that to demonstrate a practical implementation is possible and preferably show where communities have successfully implemented a system where a sufficiently large number of profiles co-exist based entirely on "flat" redundant constraint specifications. Anything less is more wishful thinking than evidence of need.

kcoyle commented 6 years ago

@rob-metalinkage All of the DCAT profiles are flat. If you are referring to profiles in some machine-actionable form, we don't currently have any in our examples so please add any that you know about to our list of resources on the home page. (Actually, the original BIBFRAME profiles were machine-actionable but 1) are not being used, 2) didn't include anything but vocabulary terms, and 3) are flat.)

rob-metalinkage commented 6 years ago

Sorry, but this is just wrong. DCAT-AP imports DCAT, ergo it is not flat.

I do agree that mechanisms to flatten are useful and implementations must be allowed to choose these options. I think perhaps you need to develop a specific use case around the provision of flattened packages (and there is always a point at which you decide the client just needs to know something) rather than trying to question the evidence for the existing practice of defining profiles in hierarchies.

A client that "knows" dcat might be able to use dcat-ap as a flat profile... but what about geodcat-ap or dcat-ap-it?

Nothing stopping dcat-ap-it being available as a flattened graph containing all the dcat and dcterms axioms. These are distributions of the profile itself, so it would be great to have this Use Case articulated.

kcoyle commented 6 years ago

I begin to wonder if we are using the same terms with the same meanings. First, DCAT-AP: by this I refer to a PDF (which is darned hard to get to on the EU site, so I've socked away my own copy, which says it is: SC118DI07171, DCAT Application Profile for data portals in Europe, Version 1.1). From what I can tell it lists all of the terms that are included in the application profile, from a variety of namespaces, including DCAT (which itself lists terms from DCT and others). The RDF file (which I haven't examined in full) seems to also contain all of the classes and properties from DCAT-AP. This is what I would call "flat" in that the classes and properties have been copied to the DCAT-AP; there is no "import" function (e.g. OWL imports). Next, "import" to me means "just in time" inclusion of properties and classes during run-time, which does not seem to be how DCAT-AP works. So I would call DCAT-AP "flat" in that it is a single document or RDF vocabulary that is complete as written and does not require examining or importing from another source.

I suspect that what you are calling "import" I am seeing as "reuse".

What seems key here is "Nothing stopping dcat-ap-it being available as a flattened graph containing all the dcat and dcterms axioms". This may be the crux of the matter, and it is somewhat analogous to some things that we struggled with in the SHACL/shapes working group - to what extent one can expect that applications access and enforce rules (domains, ranges) from the parent vocabulary. That I see as an interesting question for profiles (and other reuses). In the SHACL/shapes work it was decided to ignore any axioms not present in the actual graph being validated (including sub/super relationships). That said, I do believe this is a validation question as much or more than it is a profile question, and I'm not sure that it is something we can tackle in terms of providing guidance. It seems to be a general issue around "mix'n'match" vocabularies and beyond our scope. If there are authoritative discussions of this in W3C documentation we can refer to it (without trying to solve it).
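The point about ignoring axioms that are not present in the graph being validated can be illustrated with a toy example. This is a plain-Python sketch, not SHACL itself: triples are modelled as tuples, `instances_of` is a hypothetical helper, and `ex:GeoDataset` and `ex:d1` are invented identifiers.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def instances_of(cls, data_graph):
    """Nodes typed as `cls` or one of its subclasses.

    Only triples found in `data_graph` are consulted; subclass axioms
    published elsewhere (e.g. in a parent vocabulary) are invisible.
    """
    classes = {cls}
    changed = True
    while changed:  # collect subclasses until nothing new is found
        changed = False
        for s, p, o in data_graph:
            if p == SUBCLASS_OF and o in classes and s not in classes:
                classes.add(s)
                changed = True
    return {s for s, p, o in data_graph if p == RDF_TYPE and o in classes}


data = {("ex:d1", RDF_TYPE, "ex:GeoDataset")}
axiom = ("ex:GeoDataset", SUBCLASS_OF, "dcat:Dataset")

without_axiom = instances_of("dcat:Dataset", data)
with_axiom = instances_of("dcat:Dataset", data | {axiom})
print(without_axiom, with_axiom)
```

In this sketch the node is only recognised as a dataset once the subclass axiom has been copied into the graph under validation, which is one practical argument for distributing a flattened graph alongside a hierarchically defined profile.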

That said, I may have misunderstood how DCAT-AP works, and if so I would appreciate pointers to what I have missed.

makxdekkers commented 6 years ago

@kcoyle Karen, you are right. The RDF file for DCAT-AP is 'flat' as it includes all axioms from DCAT, DC and others. The same is true, as far as I know, of most, if not all, national profiles based on the European profile.

rob-metalinkage commented 6 years ago

Inclusion in a serialisation is an implementation concern. DCAT profiles are quite obviously declared with a hierarchical specialisation model. Confusing implementation and description will not help with an understanding of "how things work".

Specialisation hierarchies can be implemented by reference (each clause references its parent), by an "import" such as owl:imports (these are platform specific), or by "duplication", as in the RDF file for DCAT-AP.
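To make the distinction between definition and packaging concrete, here is a minimal sketch of how a hierarchy defined by reference could be turned into a "duplication"-style flat view. Everything in it is assumed for illustration: the `profile_of` and `constraints` keys, the profile names, and the constraint values are invented and do not describe DCAT-AP or any real profile. It also assumes the hierarchy is acyclic.

```python
def flatten(profile_id, profiles):
    """Build a self-contained constraint set for `profile_id`.

    Parents are merged first so that a child's own constraints
    override anything inherited. Assumes no cycles in `profile_of`.
    """
    profile = profiles[profile_id]
    merged = {}
    for parent_id in profile.get("profile_of", []):
        merged.update(flatten(parent_id, profiles))
    merged.update(profile.get("constraints", {}))
    return merged


# Hypothetical three-level hierarchy, defined by reference.
profiles = {
    "base": {"constraints": {"title": "optional", "publisher": "optional"}},
    "national": {"profile_of": ["base"], "constraints": {"title": "mandatory"}},
    "regional": {"profile_of": ["national"], "constraints": {"language": "mandatory"}},
}

flat = flatten("regional", profiles)
print(flat)
```

A catalog or service could run a step like this once and publish the result as an additional distribution of the profile, so that clients which do not want to resolve parents never have to.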

A valid case for an ability to provide a flattened implementation is not a valid argument to reject a requirement for hierarchical definition. Instead we need a Use Case to support this proposed requirement for packaging.

I don't think, however, that we can support a "MUST" statement - profiles may be a mixture of machine-readable constraints and additional textually defined requirements that cannot be packaged in a single constraint language.

kcoyle commented 6 years ago

Can we develop a positive statement from this, like:

Profiles are hierarchical in nature but MAY be published as stand-alone resources containing all needed elements and constraints, as required by expected use.

agreiner commented 6 years ago

This seems at least a SHOULD to me. As requirements go, I think it's pretty obvious that there needs to be a way to obtain the details of all the constraints implied by the profile. For practical purposes, I think it also needs to be easy to do so. I'm even thinking that we might want to say that a profile should not use another profile as a base specification, so that all the details are never more than one hop away.

kcoyle commented 6 years ago

Good point, and a good reminder that we haven't yet decided what style we will be using for this document, and whether we'll use MAY, MUST, SHOULD at all. One solution would be to couch things in terms of trade-offs: if you use another profile as a base specification, here's what you gain but here's what it costs.

I believe that some of the Europeana profiles are profiles of profiles. I'm going to start a page for an analysis of profile patterns (including publication methods) so that we have something to look at. This relates to this thread. I can only contribute a few examples but if others will take on DCAT APs and Europeana we might have a good selection. I'll announce when I think I have something that might work.

rob-metalinkage commented 6 years ago

+1 for SHOULD

Note the many examples of profiles already shown in the profiledesc examples. Some specific Europeana profiles could be included there.

The trade-offs seem to relate to implementation, not definition, so first separate out the concerns and push implementation issues into the guidance doc, not the requirements discussion.