dcmi / dcap

DC Tabular Application Profile - supporting materials

Add definition of application profile #7

Closed: nichtich closed this issue 3 years ago

nichtich commented 5 years ago

I'm missing an introductory definition of "application profile". In my understanding it's a "schema that restricts an existing schema or model", but this depends on my own terminology for "schema" and "model".

kcoyle commented 5 years ago

Hi, @nichtich, yes, thanks, this is something we definitely need to do. On the Related Projects wiki page I have added a link to the W3C DCAT page, which gathered up some definitions that might be useful. That group decided on:

"A named set of constraints on one or more identified base specifications or other profiles, including the identification of any implementing subclasses of datatypes, semantic interpretations, vocabularies, options and parameters of those base specifications necessary to accomplish a particular function."

That sentence is a bit thick and would probably be more readable if written as more than one sentence. We also may not wish to follow this example, but it is there for our contemplation.

nichtich commented 5 years ago

Thanks! I'd cut this down to the core of an application profile as "a set of constraints on one or more base specifications" (application profiles are special cases of specifications). The additional "to accomplish a particular function" may be relevant to distinguish application profiles from arbitrary profiles, but I think this depends more on point of view.

tombaker commented 5 years ago

Since 1999, "application profile" has most commonly been defined in the DC community as a variation on:

"schemas which consist of data elements drawn from one or more namespaces, combined together by implementors, and optimised for a particular local application."

At the time it was felt that namespaces (only) define, while profiles (only) reuse, and it was considered good practice to document a profile separately from its underlying namespaces. Most APs in the DC style, DCAT included, still basically fit this definition, even if the vocabularies are not always presented in separate documents. The idea that one profile could extend another profile was introduced, in the DC context, in 2006, though it was felt that trying to nail down a notion of dependence between profiles too formally would be a can of worms.

"A named set of constraints on one or more identified base specifications or other profiles"

The DXWG definition only really makes sense if "constraint" is defined in the broadest possible sense. For example, one could argue that any given statement is a constraint on the set of all possible statements but, while defensible philosophically, such a high level of abstraction is not actually very helpful.

More helpful, in my opinion, is the distinction made in the DSP draft (2008), where "templates are used to express structures" and "constraints are used to limit those structures". To take an example from Mikael's DSP syntax:

<DescriptionTemplate ID="person" minOccurs="1" maxOccurs="1" standalone="yes">

    <StatementTemplate minOccurs="1" maxOccurs="1" type="literal">
        <Property>http://xmlns.com/foaf/0.1/name</Property>
    </StatementTemplate>

    <StatementTemplate ...>
    ...
</DescriptionTemplate>

This example specifies a template for statements using foaf:name with the constraints minOccurs, maxOccurs, and type. (ShEx makes the same sort of distinction but draws the line a bit differently: the statement template as a whole, with constraints such as minOccurs, corresponds to a ShEx triple constraint, while the description template is expressed in terms of ShEx shapes.)
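As a rough illustration of the template/constraint split, here is a minimal Python sketch. It is not the DSP syntax or any DSP implementation: the class names, the `check` function, and the representation of a description as a dict of property URI to list of values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


@dataclass
class StatementTemplate:
    # Template part: the structure, i.e. which property the statement uses.
    property: str
    # Constraint part: limits placed on that structure (hypothetical field names).
    min_occurs: int = 0
    max_occurs: Optional[int] = None  # None means "no upper bound"
    value_type: str = "literal"


@dataclass
class DescriptionTemplate:
    id: str
    statements: List[StatementTemplate] = field(default_factory=list)


def check(description: Dict[str, List[object]], template: DescriptionTemplate) -> List[str]:
    """Return a list of human-readable problems; an empty list means no violations."""
    problems = []
    for st in template.statements:
        values = description.get(st.property, [])
        count = len(values)
        if count < st.min_occurs:
            problems.append(f"{st.property}: expected at least {st.min_occurs}, found {count}")
        if st.max_occurs is not None and count > st.max_occurs:
            problems.append(f"{st.property}: expected at most {st.max_occurs}, found {count}")
        # In this sketch a "literal" is simply modelled as a Python string.
        if st.value_type == "literal" and not all(isinstance(v, str) for v in values):
            problems.append(f"{st.property}: all values must be literals")
    return problems


# Mirrors the "person" template above: exactly one literal foaf:name.
person = DescriptionTemplate(
    id="person",
    statements=[StatementTemplate(FOAF_NAME, min_occurs=1, max_occurs=1)],
)
```

Calling `check({FOAF_NAME: ["Alice"]}, person)` returns an empty list, while a description with no name, or with two names, yields one problem each. The point of the sketch is only that the template (which property appears) and the constraints (how often, what kind of value) can be kept as separate notions.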

If, in DXWG terms, everything in a profile is a constraint, then the entire statement template above, plus its enclosing description template, would be considered constraints. And if a profile, in DXWG PROF terms, is the sum of its relevant PDF, HTML, TTL, SHACL, ShEx, and Schematron files, encompassing everything from namespace documents to user guides, then I guess these also express sets of constraints?

Bottom line: it feels reductive to define profiles only in terms of constraints. I would prefer that we stick with a definition closer to Heery and Patel 2000 - or come up with a definition that introduces an abstract (i.e., not ShEx- or SHACL-specific) notion of shape.

analice1pt commented 5 years ago

If I am designing a profile to release LOD, in general terms I want to express what properties I use, their ranges and domains in the profile (taking into account the ranges and domains defined in the respective schemas), types (which I may include in ranges) and other things such as cardinality.
If I am designing a profile to reuse data made available by others, I want to add more information, such as the relations among those data and their respective properties and constraints. For me an application profile is a (linked) open data model, and it may be expressed more formally or less formally.

analice1pt commented 5 years ago

Just to clarify: when I mention "data made available by others", I refer to data coming from multiple sources that I do not control. This means dealing with data conformant to potentially different application profiles.

kcoyle commented 5 years ago

Here's another definition of "application profile":

"An application profile is a document that specifies how metadata elements from existing data models, possibly including locally defined additions, are combined and reused for a particular application. It can be expressed as a technical document, a machine-readable schema or just a consistently applied informal set of conventions."

From: Osma Suominen, Nina Hyvönen. "From MARC silos to Linked Data silos?" https://doi.org/10.5282/o-bib/2017H2S1-13

tombaker commented 3 years ago

@kcoyle The definition above is worth another look. Maybe close this because the discussion has moved elsewhere?

kcoyle commented 3 years ago

Defined in TAPvocabulary.md as:

Profile

An application profile specifies the structures and metadata terms used in a dataset. At a minimum the profile must provide the data elements that make up the metadata definition; a profile may also include rules for validity, such as value constraints and element cardinality.

philbarker commented 3 years ago

Can we make it clearer that an application profile should reuse terms from existing vocabularies?

kcoyle commented 3 years ago

Good idea, Phil. How about:

Using terms from existing vocabularies, an application profile specifies the structures and metadata terms used in a specific dataset. At a minimum the profile must provide the data elements that make up the metadata definition; a profile may also include rules for validity, such as value constraints and element cardinality.
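To make the "at a minimum ... may also include" part concrete, here is a small Python sketch of a tabular profile. It is only an illustration under assumptions: the column names, the prefixed terms, and the `load_profile` and `validate` helpers are chosen for this example and are not taken from TAPvocabulary.md.

```python
import csv
import io

# Hypothetical profile table. The only required column in this sketch is
# propertyID (the data elements); the other columns carry optional validity rules.
PROFILE_CSV = """propertyID,mandatory,repeatable,valueDataType
dct:title,true,false,xsd:string
dct:creator,false,true,
"""


def _as_bool(text):
    # Treat anything other than "true" (case-insensitive) as false.
    return (text or "").strip().lower() == "true"


def load_profile(text):
    """Parse the CSV text into a list of row dicts with boolean cardinality flags."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        if not row.get("propertyID"):
            raise ValueError("every row must name a property")
        row["mandatory"] = _as_bool(row.get("mandatory"))
        row["repeatable"] = _as_bool(row.get("repeatable"))
    return rows


def validate(record, profile):
    """Check a record (dict of property -> list of values) against cardinality rules."""
    problems = []
    for row in profile:
        values = record.get(row["propertyID"], [])
        if row["mandatory"] and not values:
            problems.append(f"{row['propertyID']} is mandatory but missing")
        if not row["repeatable"] and len(values) > 1:
            problems.append(f"{row['propertyID']} is not repeatable")
    return problems


profile = load_profile(PROFILE_CSV)
```

With this table, `validate({"dct:title": ["A title"]}, profile)` reports no problems, while a record without `dct:title`, or with two titles, reports one each. The data-type column is parsed but not checked here; value constraints would be a further optional layer on top of the list of data elements.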

philbarker commented 3 years ago

That's better.