w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

Revisiting the definition of "profile" #963

Closed kcoyle closed 5 years ago

kcoyle commented 5 years ago

As per the meeting of June 25, 2019, this is an area for the discussion of the definition of "profile". Let's put each definition in a separate comment so people can up or down vote them.

rob-metalinkage commented 5 years ago

It feels like we are at last getting to grips with this beast :-).

It seems the concept of specification is made more complex by the sense that it's something to conform to, and by the use of the term for documents that do more than specify things, or that may contain multiple such specifications.

Simon was on the money when he explicitly modelled "conformanceClass" as the specifying component of a specification document - but that's not really a "mass market" accessible set of terminology.

In my mind I'm happy to think of the specification component of a specification document as a "profile" - so a specification document may have a mandatory profile and a recommended profile - or multiple discrete cases of each of these.

I think we can't fix the use of "specification" as a noun (i.e. define it well enough) because of its conflicting but common usage senses - but as a verb it does describe a role well enough.

More than happy to try to explicitly model the idea of a specification document containing multiple profiles - the OGC example shows how a naming policy can be enforced - but that's a corner case against legacy documents.

I think we have a choice:

1) define a convention and mechanism for describing how "specification document" objects can be polymorphic (act as a mandatory profile and a recommendation profile and a guidance note and a set of examples etc) - nb I think this can already be handled in Profiles ontology using roles, but we may want to define role like "mandatoryCore" - so a profile can state it represents that part of a document.
2) limit scope to cases where the equivalent of "conformanceClass" is explicitly named - we just use those names as references - profiles or X ('specification' seems not to work)
3) specify a mechanism to provide names for the different components within a specification document and define how each named thing is related to the original document.
4) some other idea?

Once we have consensus about the relationship of profile to "specification" I can model options to describe - but I feel that my original "computer scienc-y" view that each "conformanceClass" is naturally a profile of itself can be extended happily to "each specification document describes a number of possible profiles of the specification" - and we just need a single notion of profiles - as per conneg-by-ap (and consistent with @tombaker comments on this matter)

aisaac commented 5 years ago

@rob-metalinkage just checking: is this comment really for this issue? As it mentions "documents", "specifications", "conformance classes" and "documents that contain multiple specifications", that makes me think of #978

rob-metalinkage commented 5 years ago

I should have put the link to #978 into this issue - that UC is input into solving this issue - trying to untangle the conflated notions we have identified...

aisaac commented 5 years ago

@rob-metalinkage it is still not too late: you could move your comment to #978 and replace it by a link?

rob-metalinkage commented 5 years ago

No, the comment belongs here - it's all about the definition - which relates to what the thing is (I hope)

tombaker commented 5 years ago

UPDATED PROPOSAL

Taking into account the discussion above, here is an updated proposal for a definition of "profile" to use both in DCAT and CONNEG:

data profile

    A data specification that constrains, extends, 
    combines, or provides guidance or explanation 
    about the usage of other data specifications.

data specification

    A specification, with human- and/or
    machine-processable representations, that defines
    the content and structure of data used in a given
    context.

As per @aisaac's concerns about the ambiguity of "annotates", I suggest we add something like: "In this context, 'annotate' means 'to provide guidance or explanation about usage'."

By my count, circa eight out of the eleven active members in this discussion have expressed support for these definitions (or a variant thereof), either by upvoting in Github or by commenting on calls.

I agree with Antoine that the definition of "profile" should acknowledge and cite two other usages of "profile" in a data context -- JSON-LD "profiles" ("syntactic profiles") and "data profiling" (a data analysis activity) -- if only to say that they are out of scope.

See https://lists.w3.org/Archives/Public/public-dxwg-wg/2019Jul/0201.html for further discussion

RubenVerborgh commented 5 years ago

I was alerted to this discussion by @tombaker's mail kindly forwarded by Peter. I was surprised to see that the profile definition is indeed still a concern. I had already voiced a strong opinion against the definition of "named set of constraints on one or more identified base specifications", which non-related concepts like a programming language also satisfy (see https://www.w3.org/2017/dxwg/wiki/ProfileContext#Comments.2Fobjections).

I think the above definitions by @tombaker are indeed closer to what we want. But I have some concerns there regarding it being a recursive definition. I don't think the essence of a profile (there being defined as a data specification) is that it related to another specification. However, I see above that the notion of "profiling is about acknowledging a reuse" is strongly present.

FYI, this is what the IETF draft currently says:

In the context of this proposal, a profile is a description of the structural and/or semantic constraints of a group of documents.

The interesting difference being that the above definition also allows "extends" rather than "constrains"; however, I don't see this as a contradiction, as the IETF definition talks about constraints with regard to documents, not other profiles.

What matters from a purely technical perspective is that a profile constrains a document beyond a media type; other parts of the definition might indeed matter for usage, but not down at the negotiation level. So as long as there is no contradiction, all is fine.

aisaac commented 5 years ago

@tombaker I would really push for replacing "annotates other specification" by "provides guidance or explanation about usage of other specifications". The addition is not very long and "annotates" can be really confusing.

aisaac commented 5 years ago

@RubenVerborgh the problem you see with the recursive definition is one of naming. We agree that we need two things, one for specs based on other specs and one for specs that could be self-standing (https://lists.w3.org/Archives/Public/public-dxwg-wg/2019Jun/0106.html). I was rather in favour of naming the latter "data profiles", which I believe matches your "I don't think the essence of a profile (there being defined as a data specification) is that it related to another specification". But it seems that more people are in favour of keeping "profile" for the "recursive" case. But again, it's a mere issue of naming, and I'd personally be OK with changing that in the coming months if we discover something new. What's important to me is the split into two definitions, which captures the main divide we've got in our specs at the moment.

RubenVerborgh commented 5 years ago

Thanks, @aisaac. I don't mind that strongly; I am only involved with the IETF part, and as long as we have compatibility (IETF will be more generic), then all is good.

aisaac commented 5 years ago

@rob-metalinkage I think I'm still a bit reluctant to include your comment in the base definition. It's relevant, but for me it's the next step. I.e. we first agree on something short and then we expand on the details of how profiles can exist, what their components (documents) are and how they work together. What I have in mind is the current structure of the Profile Guidance draft, where the base definition is in section 1 and then sections 2, 3 and 4 dive into further characterization of profiles, which would include matter from #978 (which is why I'm so interested in having all your thoughts on #978 attached to that issue and not scattered into the one here).

pwin commented 5 years ago

I want to include in @tombaker's proposal the word "combines", as first mentioned by @andrea-perego in https://github.com/w3c/dxwg/issues/963#issuecomment-508577664, as many APs such as CPSV-AP etc. are just that: they don't constrain, extend, etc. They just aggregate a few building blocks defined elsewhere.

tombaker commented 5 years ago

@pwin Since I was the only one who had yet updated my proposal (above), I have added "combines". @aisaac Fine with me to replace "annotate" with something more explicit.

I have amended the proposal.

pwin commented 5 years ago

https://www.w3.org/2002/09/wbs/99375/profile-def/

Could colleagues please vote on this?

andrea-perego commented 5 years ago

@pwin , I confess it is not clear to me which is (are) the definition(s) we are voting on.

Can we have it/them explicitly included here in a specific comment?

pwin commented 5 years ago

@andrea-perego The definition is in 2 parts, but is a single definition:

data profile

A data specification that constrains, extends, 
combines, or provides guidance or explanation 
about the usage of other data specifications.

data specification

A specification, with human- and/or
machine-processable representations, that defines
the content and structure of data used in a given
context.

The vote is a simple Yes/No to the question: are you happy that we use this for conneg and dcat?

kcoyle commented 5 years ago

I have to say that I really like the essence of this clear, simple definition from the IETF document:

In the context of this proposal, a profile is a description of the structural and/or semantic constraints of a group of documents.

I'm a bit unclear on the "of a group of documents", but I like the "structural and/or semantic".

Also, I'd be happy to shorten the statement "or provides guidance or explanation about the usage of" in Tom's proposal because the "or"s there saddle us with some language ambiguity: "(or provides guidance or explanation) about the usage" or "or provides guidance or (explanation about the usage)".

If we approve the definition, I'd like to have a chance to word-smith to make sure that it is clear. My gut tells me that we don't need both guidance and explanation in there, so we could clean that up easily.

RubenVerborgh commented 5 years ago

I'm a bit unclear on the "of a group of documents"

Yes… the intent was "certain documents". Will change that.

but I like the "structural and/or semantic".

This was specifically meant to contrast with media types, which additionally provide syntactic constraints.

tombaker commented 5 years ago

@RubenVerborgh I kinda like the IETF definition too, but I'm also unclear on "of a group of documents" because the wording seems to support two quite different interpretations:

In any case, the term "document" is both very broad and very specific. It is specific because it implies "file", but it is broad because what print or digital file is not a "document"? I take "group of documents" to refer to what we have been calling a "specification", which might consist of a family of related expressions of, say, a vocabulary, e.g. in RDF and PDF.

@kcoyle Good catch re: the ambiguity of "provides (guidance or explanation) about the usage" versus "provides guidance or (explanation about the usage)".

RubenVerborgh commented 5 years ago

The IETF definition is now:

a profile is a description of structural and/or semantic constraints documents can conform to in addition to the syntactical interpretation provided by more generic MIME types.


It is specific because it implies "file"

It actually means "representation" to me, in the REST sense of the word, but I thought that would be too specific.

I take "group of documents" to refer to what we have been calling a "specification"

That was intended as "certain documents that conform to the specification"; clarified now.

smrgeoinfo commented 5 years ago

My reading of that IETF definition is that it defines 'profile' to only apply to profiles of MIME types -- certainly a narrower scope than our discussions here. On close parsing: a profile is a DESCRIPTION of constraints {on documents that have MIME types}. Use of 'CAN' is not standard specification language; my reading would interpret CAN to denote 'possible', not permitted (MAY), recommended (SHOULD) or required (MUST).

I don't think it's suitable for our purposes.

RubenVerborgh commented 5 years ago

My reading of that IETF definition is that it defines 'profile' to only apply to profiles of MIME types

That's not what it says though.

my reading would interpret CAN to denote 'possible'

Actually, yes. It is possible that documents conform to a profile. If we change the can into a MAY, then the constraints seemingly become optional. Let me see if I can further refine.

rob-metalinkage commented 5 years ago

I think I can live with this (it's two definitions, but it has proven necessary to state that there is a special case of specification that relates to "data used in a given context"; this seems to match OK with "certain documents" in the IETF version) and propose a useful clarification.

The only problem is that it may still be too hard to see how the (undefined) term "usage" resolves the ambiguity between re-use and profiling - you need to go through to the end of the data specification definition to the "data in a given context" and successfully infer that therefore a profile must also be constraining the same "given context". IMHO it would be useful to state this up front so the definition implications are clear, and vocab re-use is not swept up in a too broad definition of profiles.

I therefore suggest something along the lines of

A data specification that constrains, extends, combines, or provides guidance or explanation about the usage of other data specifications whereby the "given context" or the profile remains compliant with these "base" specifications.

Without this explicit clause it will be too hard for the definitions to be interpreted as a whole, and the ambiguity of the word "usage" is too great - would I be using something if I made a statement like this?

:myClass owl:disjointWith you:yourClass .

(Yes - I'd be using it to clarify the semantics of the thing I was defining - but I'm not profiling it in any sense.)

(It would be easier if we had a definition equivalent to "conformanceClass" that we could make simple statements about, but I don't believe we have identified an acceptable term for this "given context".)

akuckartz commented 5 years ago

What is meant by "a description of"?

kcoyle commented 5 years ago

@rob-metalinkage

to the "data in a given context" and successfully infer that therefore a profile must also be constraining the same "given context".

I do not believe that this is an inference that can be made from Tom's definitions. The definition of specification is not about A specification, but specifications in general. The definition of profile is that it IS a specification. There aren't two different things here so there's no "same" to be read into it.

rob-metalinkage commented 5 years ago

That's not the point at all. It's the "context" that must be consistent with the base specification. Of course context is an undefined term introduced by this definition, but I am happy enough with it, except that other people don't seem to understand that it's the important part of the functional requirements for profiles. I don't think "conformance target" is great, but at least it highlights that it's an important concept.

Also, it's a syllogism to impute that what can be inferred from an instance because of the definition of the class is not about that definition...

kcoyle commented 5 years ago

@rob-metalinkage My point was that there is no "base specification" in Tom's definition. The relationship between specification and profile is an IS A relationship: a profile is a kind of specification, like a dog is a kind of mammal. There is no "specification" that a profile is a profile of, at least not in that definition. Whether that concept is included in, say, the profiles guidance document is not excluded, but the definition here is unrelated to that, and therefore there is nothing that must be "consistent". Specification is a class, not an instance, and a profile is a kind of specification, and is consistent with the definition of specification but not with any instance of specification.

nicholascar commented 5 years ago

The relationship between specification and profile is an IS A relationship

As per PROF so far:

:Profile rdf:type owl:Class ;
         rdfs:subClassOf dct:Standard .

So yes, every prof:Profile IS A dct:Standard but also:

:isProfileOf rdf:type owl:ObjectProperty ;
             rdfs:domain :Profile ;
             rdfs:range dct:Standard .

And it's this that really makes prof:Profile something a bit more than just a rewrite of the dct:Standard class. Profiles really are a profile of something, even if that profiling is trivial (a Standard being a profile of itself).

Don't forget the expected use of:

<something> dct:conformsTo <Profile_Y> .

So things can conform to profiles, identified by some URI.

Am I right in asserting that the above, very long discussion, won't require any changes to these core concepts in PROF?

dr-shorthair commented 5 years ago

The global constraint is not enough. It merely says 'if there is a prof:isProfileOf relationship then it is from a prof:Profile to a prof:Standard'. But it does not require that the relation be present in every instance. To achieve that you also need a local constraint on the prof:Profile class

prof:Profile
  rdfs:subClassOf [
      rdf:type owl:Restriction ;
      owl:minCardinality "1"^^xsd:nonNegativeInteger ;
      owl:onProperty prof:isProfileOf ;
    ] ;
.

Aside: IS A is ambiguous. I assume you mean 'is a kind of', rather than 'is an instance of'

larsgsvensson commented 5 years ago

@dr-shorthair scripsit:

To achieve that you also need a local constraint on the prof:Profile class

Or you create a profile and declare the constraint using SHACL:

ex:ProfileShape
    a sh:NodeShape ;
    sh:targetClass prof:Profile ;    # Applies to all instances of prof:Profile
    sh:property [
        sh:path prof:isProfileOf ;
        sh:minCount 1 ;
        sh:class dct:Standard ;
    ] .

dr-shorthair commented 5 years ago

If you don't have a canonical URI or even artefact, for the dependency, you can always describe it, in a blank-node if necessary:

<> a prof:Profile ;
   prof:isProfileOf  [ a prof:Standard ;
      dct:description "lots of words here if necessary" ;
   ] .

If you want to define 'profile' using a formal, RDF-based notation then you should follow it through all the way.
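
The cardinality constraint discussed above (every prof:Profile has at least one prof:isProfileOf statement) can be illustrated with a minimal Python sketch. It is not part of PROF or of any comment in this thread: the triples are plain tuples, and the resource names (`ex:dcat-ap`, `ex:dcat`, `ex:orphan`) and the function name are hypothetical. It is a closed-world check, so it behaves like the SHACL `sh:minCount 1` reading; under OWL's open-world semantics a missing statement would not by itself count as a violation.

```python
# Hypothetical, prefix-abbreviated names used only for illustration.
RDF_TYPE = "rdf:type"
PROF_PROFILE = "prof:Profile"
IS_PROFILE_OF = "prof:isProfileOf"


def profiles_missing_base(triples):
    """Return, sorted, the declared profiles that have no prof:isProfileOf statement."""
    # Every subject explicitly typed as prof:Profile.
    profiles = {s for s, p, o in triples if p == RDF_TYPE and o == PROF_PROFILE}
    # Every subject that states at least one base specification.
    with_base = {s for s, p, o in triples if p == IS_PROFILE_OF}
    return sorted(profiles - with_base)


triples = [
    ("ex:dcat-ap", RDF_TYPE, PROF_PROFILE),
    ("ex:dcat-ap", IS_PROFILE_OF, "ex:dcat"),
    ("ex:orphan", RDF_TYPE, PROF_PROFILE),  # declares no base specification
]

print(profiles_missing_base(triples))  # ['ex:orphan']
```

The domain and range axioms alone would accept all three triples; only the additional minimum-count check flags `ex:orphan`.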

kcoyle commented 5 years ago

https://stackoverflow.com/questions/2218937/has-a-is-a-terminology-in-object-oriented-language http://java8.in/is-a-relationship-and-has-a-relationship/

IS A and HAS A are fairly commonly used both in O-O and in the preceding philosophical branch. It refers to typing, and therefore is a class relationship. I don't know of an equivalent that indicates "instance of" although now that I think about it, "instance of" sounds ambiguous to me (it would have to be an IS A relationship). I'm going to ponder that one for a bit.

kc

On 7/14/19 10:10 PM, Simon Cox wrote:

The global constraint is not enough. It merely says 'if there is a prof:isProfileOf relationship then it is from a prof:Profile to a prof:Standard'. But it does not require that the relation be present in an instance. To achieve that you also need a local constraint on the prof:Profile class

prof:Profile
  rdfs:subClassOf [
      rdf:type owl:Restriction ;
      owl:minCardinality "1"^^xsd:nonNegativeInteger ;
      owl:onProperty prof:isProfileOf ;
    ] ;
.

Aside: IS A is ambiguous. I assume you mean 'is a kind of', rather than 'is an instance of'


larsgsvensson commented 5 years ago

@nicholascar scripsit:

Am I right in asserting that the above, very long discussion, won't require any changes to these core concepts in PROF?

As I see it, yes: every profile is a kind of standard, but not every standard is a kind of profile, so the axiom holds:

:Profile rdf:type owl:Class ;
         rdfs:subClassOf dct:Standard .

And yes, given

:isProfileOf rdf:type owl:ObjectProperty ;
             rdfs:domain :Profile ;
             rdfs:range dct:Standard .

a standard can be a (trivial) profile of itself. So

:aSomething dct:conformsTo :aProfile .
:aProfile a prof:Profile ;
    prof:isProfileOf :aStandard .

also means that aSomething conforms to :aStandard. (aside: This brings up the question if prof:isProfileOf should be declared transitive and also if there is a rule (:a dct:conformsTo :b , :b prof:isProfileOf :c) -> :a dct:conformsTo :c)
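
The rule raised in the aside above, (:a dct:conformsTo :b, :b prof:isProfileOf :c) -> :a dct:conformsTo :c, with prof:isProfileOf treated as transitive, can be sketched in Python as follows. This is only an illustration under the assumption that the rule is adopted, which the thread leaves as an open question; the function name and the dictionary layout are hypothetical.

```python
def conforms_to_closure(conforms, is_profile_of):
    """Expand conformance statements along prof:isProfileOf links.

    conforms:       maps a resource to the set of profiles it declares conformance to
    is_profile_of:  maps a profile to the set of specifications it profiles
    """
    result = {}
    for resource, targets in conforms.items():
        seen = set()
        stack = list(targets)
        while stack:
            target = stack.pop()
            if target in seen:
                continue  # already visited; also guards against self-profiles and cycles
            seen.add(target)
            stack.extend(is_profile_of.get(target, ()))
        result[resource] = seen
    return result


conforms = {":aSomething": {":aProfile"}}
is_profile_of = {
    ":aProfile": {":aStandard"},
    ":aStandard": {":aStandard"},  # a standard as a trivial profile of itself
}

inferred = conforms_to_closure(conforms, is_profile_of)
print(sorted(inferred[":aSomething"]))  # [':aProfile', ':aStandard']
```

Tracking visited nodes means the trivial case of a standard being a profile of itself does not cause an infinite loop.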

aisaac commented 5 years ago

@nicholascar yes I believe that all these long discussions don't change what we've tried to specify for some months already. Also the profiles use cases and the profile guidance draft won't be impacted that much I believe.

I'm not a big fan of profiles being profiles of themselves, but as I see it in the formal definitions floated around this is merely a possibility not a duty so I can live with that :-)

@dr-shorthair @larsgsvensson yes I guess we can either define the Profile class with OWL and/or SHACL the way you've done it or similar, and that should end in the PROF ontology. Perhaps it's already there in fact. Btw just checking: your two definitions are equivalent, aren't they?

larsgsvensson commented 5 years ago

@aisaac scripsit:

Btw just checking: your two definitions are equivalent, aren't they?

Yes, I think they are.

kcoyle commented 5 years ago

Decision (13 for, 1 against) in poll. Final text is:

Data Profile

A data specification that constrains, extends, combines, or provides guidance or explanation about the usage of other data specifications.

Data Specification

A specification, with human- and/or machine-processable representations, that defines the content and structure of data used in a given context.

aisaac commented 5 years ago

@kcoyle not sure we can put it due for closing. PROF is ok wrt having this definition represented, but Profile Guidance has not been updated yet. Maybe we can just remove the PROF label so that it doesn't stand in the way of PROF?

nicholascar commented 5 years ago

@aisaac better not to remove the label, as we'll lose the association, so I've marked it "prof-due-for-closing" so that the full "due-for-closing" can be added when Profile Guidance gets there.

kcoyle commented 5 years ago

This was to discuss the counter-proposal that Rob added to the poll on the definition. As the results of the poll were 13 for the definition based on Tom's definition, and 1 against (Rob), I think we can close this, with the assumption that Profile Guidance will use the definition that 13 members voted for, not the counter-proposal.

aisaac commented 5 years ago

@kcoyle yes this is probably simpler. As a matter of fact I've just worked the new definition in the document that I (still) want to use to prepare for the next updates on Profile Guidance (https://docs.google.com/document/d/1Y4jP4SGZMnt63EpjTX11-hW6-3mxlaq1i-Lbiw4tx1M/)

kcoyle commented 5 years ago

Decision (13 for, 1 against) in poll. Alternate text has not gathered adherents. Therefore this issue to discuss the alternate text can be closed. Documents will use the final text, which is:

Data Profile

A data specification that constrains, extends, combines, or provides guidance or explanation about the usage of other data specifications.

Data Specification

A specification, with human- and/or machine-processable representations, that defines the content and structure of data used in a given context.