w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

Consider use of adms:identifier instead of prof:token #453

Closed nicholascar closed 4 years ago

makxdekkers commented 5 years ago

@nicholascar Can you provide a description for the issue?

nicholascar commented 5 years ago

Sure: we have a real need for non-URI identifiers for profiles when we need to refer to them where URIs are unsuitable, say in query string arguments in an HTTP API. Until now, we have presented a token property for the Profile class pointing to an xsd:token object for this, but couldn't we just use an adms:identifier instead? We could then fully implement adms:Identifier instances with format string etc. if required.

andrea-perego commented 5 years ago

@nicholascar , I also was unsure about the purpose of prof:token, when profiles are supposed to have URIs. Now I better understand the point.

However, it is still unclear to me how this token will be defined: will it be a global identifier (and, in such a case, how can this be ensured?), or will it rather be specific to the service / API? In the latter case, I don't see how the token can be associated with the profile while avoiding possible collisions.

agreiner commented 5 years ago

A URI can be used in a query string by urlencoding or something like base64 encoding. Maybe that's all one would need for a token.
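
A minimal sketch of what that suggestion could look like, assuming Python's standard library and using the DCAT namespace URI as the example profile URI. The helper names are hypothetical and only illustrate the two encodings mentioned; nothing here is prescribed by PROF or Profiles Conneg.

```python
import base64
from urllib.parse import quote, unquote

# Example profile URI (the DCAT namespace URI).
PROFILE_URI = "http://www.w3.org/ns/dcat#"


def percent_encode(uri: str) -> str:
    # safe="" forces ':', '/' and '#' to be escaped as well.
    return quote(uri, safe="")


def percent_decode(value: str) -> str:
    return unquote(value)


def b64_encode(uri: str) -> str:
    # URL-safe alphabet; padding is stripped so '=' never appears in the value.
    raw = base64.urlsafe_b64encode(uri.encode("utf-8"))
    return raw.decode("ascii").rstrip("=")


def b64_decode(value: str) -> str:
    # Restore the padding that b64_encode stripped before decoding.
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding).decode("utf-8")
```

Both forms round-trip back to the original URI, so either could in principle stand in for a token, at the cost of a longer and less readable query string.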

andrea-perego commented 5 years ago

@agreiner said:

A URI can be used in a query string by urlencoding or something like base64 encoding. Maybe that's all one would need for a token.

+1

rob-metalinkage commented 5 years ago

A lot of existing systems have a compatible concept of profiles using a "server-scoped" profile token, e.g. one of the W3C publishing process helper services takes an argument profile=FPWD which determines how it renders the content...

So the token allows us to retrofit URI identifiers to either locally or globally scoped profile identification tokens and generate a canonical metadata graph for these profiles already in the wild. And use of tokens is still likely to be preferable when using URLs instead of headers to navigate using alternative profiles.

nicholascar commented 5 years ago

Tokens for identifying profiles are seen in use in APIs such as Epimorphic's ELDA and CSIRO's pyLDAPI, e.g.:

So in the first case, the profile (view) token used is 'alternates'; in the second it's 'dcat'. There may be no URI equivalent for the first (it's a listing of available profiles and, unless defined with a standard, may be replaced with mechanics specified in Profiles Conneg). Clearly, however, the second use, of dcat, is equivalent to profile identification via the DCAT namespace URI.

Yes, it would be an option to URL encode the DCAT namespace URI yielding http://linked.data.gov.au/dataset/gnaf?_view=http%3A%2F%2Fwww.w3.org%2Fns%2Fdcat%23&_format=text/turtle but this is ugly and long so we've not been using any forms of URIs for tokens.

All we are really dealing with is the equivalent of @prefix dcat: <http://www.w3.org/ns/dcat#> for convenience of use. PROF currently provides for this - you don't have to use it if you don't want to! - so this Issue's question is whether adms:identifier can describe identifiers like this, rather than PROF's token property.
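
To make the comparison concrete, here is a rough sketch that builds both request URLs. The `TOKEN_TO_PROFILE` table and the function names are hypothetical, and the short `_view=dcat` form is an assumption inferred from the description above; only the URL-encoded form is quoted verbatim in this thread.

```python
from urllib.parse import quote

# Hypothetical service-local table; it plays the same role as a Turtle
# @prefix declaration: a short name standing in for a long URI.
TOKEN_TO_PROFILE = {"dcat": "http://www.w3.org/ns/dcat#"}

BASE = "http://linked.data.gov.au/dataset/gnaf"


def url_with_token(token: str, media_type: str) -> str:
    # Short form: the token is passed as-is (assumed form, see lead-in).
    return f"{BASE}?_view={token}&_format={media_type}"


def url_with_uri(token: str, media_type: str) -> str:
    # Long form: the profile URI itself, percent-encoded for the query string.
    profile_uri = TOKEN_TO_PROFILE[token]
    return f"{BASE}?_view={quote(profile_uri, safe='')}&_format={media_type}"
```

Calling `url_with_uri("dcat", "text/turtle")` reproduces the long URL quoted above, while `url_with_token` gives the shorter equivalent that a token makes possible.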

agreiner commented 5 years ago

My concern with this is that people may use the token as a unique identifier, but there is no registry to ensure uniqueness, and if we are to consider the use of whatever already existing tokens people have used for a thing they call a profile, uniqueness is already lost. Why include it in a standard then?

rob-metalinkage commented 5 years ago

In the case of the view alternates, it's grounded in the semantics of the IANA registry.

There won't be one uber registry, so it really is like the prefix case: there is utility in developing some de facto standards, and a registry or catalogs can be developed.

nicholascar commented 5 years ago

Why include it in a standard then?

Because they are both already in use and are needed going forward. We exclude a series of profile access API styles if we don't include tokens (RESTful APIs) and if we don't define rules, we'll just have the wild West of implementations.

Many Semantic Web data models, predicated on URIs, have provision for alternate, local, non-authoritative identifiers, like DC, ADMS etc. This argument's just run recently in DCAT land too. Best position seems to be to define alternate identifier mechanics.

If this ontology's used at all then users will necessarily be in URI land. Catering for non-URI IDs will make some mechanics simpler and broader modelling of existing profiles simpler.

andrea-perego commented 5 years ago

@nicholascar , @rob-metalinkage , I still don't understand how tokens of this type (i.e., namespace prefixes), can be used as global identifiers, and not as something that is relative to the service / API.

I can buy a usage scenario where (a) I ask a service to list the available profiles, (b) I get the list of URIs and corresponding tokens/prefixes, and (c) I make a query using the token mapping to a given profile URI. But in this case the token cannot be included in the definition of a profile, as it may vary from service to service.
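
The (a)/(b)/(c) scenario can be sketched roughly as follows. `ProfileEntry`, `Service` and the token `dcat2` are all hypothetical names made up for illustration; the point is only that two services may map different tokens to the same profile URI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileEntry:
    uri: str    # global identifier of the profile
    token: str  # service-scoped short name


class Service:
    def __init__(self, entries):
        # Tokens only need to be unique within this one service.
        self._by_token = {entry.token: entry for entry in entries}

    def list_profiles(self):
        # Steps (a) and (b): list the available URIs with their tokens.
        return list(self._by_token.values())

    def resolve(self, token: str) -> str:
        # Step (c): map a token from a query back to the profile URI.
        return self._by_token[token].uri


DCAT = "http://www.w3.org/ns/dcat#"
service_a = Service([ProfileEntry(DCAT, "dcat")])
service_b = Service([ProfileEntry(DCAT, "dcat2")])
```

Here `service_a.resolve("dcat")` and `service_b.resolve("dcat2")` both return the DCAT URI, which is why a client would need to ask each service for its own list rather than rely on the token alone.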

BTW, I would not bank at all on the fact that we have de facto standard namespace prefixes. One example is dc: / dct: / dcterms: which are all used for http://purl.org/dc/terms/ (where dc: is also used for http://purl.org/dc/elements/1.1/). And another one is the use of vcard: and vcard2006: for http://www.w3.org/2006/vcard/ns# (where vcard: is also used for http://www.w3.org/2001/vcard-rdf/3.0#). And finally geo:, used for GeoSPARQL, although it was already widely used for the W3C Basic Geo vocabulary.
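
As a small illustration of the point, the sketch below takes the Dublin Core and vCard bindings listed above and reports which prefixes are ambiguous and which namespaces have several prefixes. The function names are hypothetical, and the GeoSPARQL case is left out because its namespace URIs are not given in this thread.

```python
from collections import defaultdict

# (prefix, namespace) pairs as described in the comment above.
BINDINGS = [
    ("dc", "http://purl.org/dc/terms/"),
    ("dct", "http://purl.org/dc/terms/"),
    ("dcterms", "http://purl.org/dc/terms/"),
    ("dc", "http://purl.org/dc/elements/1.1/"),
    ("vcard", "http://www.w3.org/2006/vcard/ns#"),
    ("vcard2006", "http://www.w3.org/2006/vcard/ns#"),
    ("vcard", "http://www.w3.org/2001/vcard-rdf/3.0#"),
]


def ambiguous_prefixes(bindings):
    # Prefixes bound to more than one namespace.
    seen = defaultdict(set)
    for prefix, namespace in bindings:
        seen[prefix].add(namespace)
    return {p: sorted(ns) for p, ns in seen.items() if len(ns) > 1}


def aliased_namespaces(bindings):
    # Namespaces reachable through more than one prefix.
    seen = defaultdict(set)
    for prefix, namespace in bindings:
        seen[namespace].add(prefix)
    return {ns: sorted(ps) for ns, ps in seen.items() if len(ps) > 1}
```

With these bindings, `dc` and `vcard` each map to two namespaces, so neither could act as a global identifier without some additional scoping.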

kcoyle commented 5 years ago

APIs are a private case of negotiation; conneg is defining a public case. I agree with @agreiner and @andrea-perego that using anything but an actual URI as an identifier in "web space" is going to break "web things".

aisaac commented 5 years ago

Could it be that the side discussion on namespace prefix abbreviations gives a hint on how to handle the case of tokens? Standardizing abbreviations seems like an ill-advised attempt, because in a local context (and abbreviations are always locally (re-)declared) one may use any abbreviation for a given namespace and things would still work. However, there is some value in trying to homogenize the use of abbreviations (at least it does sound like a best practice), and in our related work that's what has been attempted with properties like http://vocab.org/vann/#preferredNamespacePrefix

nicholascar commented 5 years ago

@andrea-perego & @aisaac: I think I agree that local context identifiers, like in an API, don't need global representation.

@rob-metalinkage: can you indicate requirements for a profile to have a token listed globally in a PROF RDF write-up of one, thus requiring a prof:token as opposed to a local scope token only?

nicholascar commented 5 years ago

@aisaac I've added a suggested PROF -> VANN mapping to https://github.com/w3c/dxwg/wiki/PROF-Alignments-and-crosswalks

This certainly seems to be the most direct, currently used, equivalent to hasToken.

aisaac commented 5 years ago

@nicholascar thanks for putting a placeholder to it. I guess the answer to the question on which relation should hold between the two properties will depend on the answers from @rob-metalinkage

andrea-perego commented 5 years ago

@nicholascar , @rob-metalinkage , I think the fact that this is a "preferred" token should be clarified in the property definition itself. Otherwise, readers will discover it only if they check the mapping with VANN - and they may also be confused because of the current definition of prof:hasToken - quoting:

A property for identifying this Profile for use in APIs

I would therefore suggest what follows:

makxdekkers commented 5 years ago

I am not sure how I closed the issue. It was certainly not my intention, and I can't even remember that I did this. I must have clicked the wrong button. By the way, I sent a message to the list (https://lists.w3.org/Archives/Public/public-dxwg-comments/2019Feb/0007.html), with a suggestion for definition and usage note:

Definition: “An alternative identifier for the Profile” Usage Note: “To be used when the Profile’s URI cannot be used, for example in APIs or in content negotiation.”

andrea-perego commented 5 years ago

Just re-opening it.

andrea-perego commented 5 years ago

@makxdekkers proposed:

Definition: “An alternative identifier for the Profile” Usage Note: “To be used when the Profile’s URI cannot be used, for example in APIs or in content negotiation.”

I'm happy with the proposal, but, as I said earlier in this thread, it is important to make it very clear in the definition and usage note (and also in the property name) that this is the "preferred" token/identifier to be used - in analogy with vann:preferredNamespacePrefix.

nicholascar commented 5 years ago

Closing the issue as we have decided that adms:identifier and the accompanying adms:Identifier can't replace the simple token statement for use in conneg and for legacy profiles without URIs.

aisaac commented 5 years ago

Before closing it would be nice to have @andrea-perego 's confirmation, whether he thinks the current wording for the new element alleviates his worry.

andrea-perego commented 5 years ago

Thanks, @aisaac .

I would recommend changing

Definition: | A preferred alternative identifier for the Profile

into:

Definition: | The preferred identifier for the Profile

I think here "alternative" is misplaced, as there is no mention in the spec about "other" / "primary" profile identifiers.

aisaac commented 5 years ago

Adding to what @andrea-perego said... I'm actually not convinced by a definition that stops at "The preferred identifier for the Profile". This is really vague: is it more preferred than the URI? I would much prefer that the definition include the context in which the token is preferred, which is currently in the usage note, i.e. having something like

"The preferred identifier for the Profile, in circumstances where its URI cannot be used, for example in API arguments or in content negotiation."

kcoyle commented 5 years ago

Agree about "preferred" - preferred to what? I'm not sure it is only used where a URI cannot be, just that there are folks who want to use something other than a URI identifier.

aisaac commented 5 years ago

Noting that in the PR to remove the reference to this issue in PROF @nicholascar has changed the definition to:

The preferred identifier for the Profile, in circumstances where its URI cannot be used.

and the usage note to:

A simple lexical form of identifier that may be accepted in some circumstances, such as API arguments or in content negotiation, to reference this profile. This is a “preferred term”, since alternative identifiers may be declared and used by any implementation

I believe these two together are good enough to alleviate the latest concerns, i.e. they express that this is how a profile's publisher says the profile should be referred to when the URI can't be used. And examples are given.

agreiner commented 5 years ago

Again, I think it would be better not to use the term identifier at all with regard to tokens. Otherwise, I think people will use them anywhere that the URI can be used. The preferred identifier is the URI. I think we should call a token a text string that can be used in content negotiation by query string argument, as the value designating a profile. The preferred token can be specified in the profile itself and can be listed alongside the URI in any listing of available profiles.
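
A rough sketch of that reading, where the token is only ever a query string value that the server maps straight back to the profile URI. The `PROFILES` table and `requested_profile` function are hypothetical, and the `_view` parameter name is borrowed from the example URL earlier in this thread.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical listing of a service's profiles: token -> profile URI.
PROFILES = {"dcat": "http://www.w3.org/ns/dcat#"}


def requested_profile(url: str, param: str = "_view"):
    # parse_qs percent-decodes values, so an encoded URI arrives decoded.
    values = parse_qs(urlsplit(url).query).get(param)
    if not values:
        return None
    value = values[0]
    if value in PROFILES:
        # A token: resolve it to the URI, which stays the identifier.
        return PROFILES[value]
    if value in PROFILES.values():
        # The client sent the (encoded) URI directly.
        return value
    return None
```

Either request form ends up as the same profile URI inside the service, so the token never needs to leave the query string.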

rob-metalinkage commented 5 years ago

@agreiner I think that is an editorial change we should be able to accommodate - the doc is frozen for now but we should attempt to reach an agreement here to make this wording change.

nicholascar commented 5 years ago

If new wording will suffice here, I will make a PR to show how the change will be implemented (there may be multiple places that need changing if the term "identifier" is not to be used) but will hold off requesting a merge of that PR until the review period freeze is over.

aisaac commented 5 years ago

@agreiner I understand the concern, but the name of the property does not include "preferred"; it is only in the definition, where (now) it is said it is for when the URI can't be used. https://w3c.github.io/dxwg/prof/#Property:hasToken

Also, for me the token can be useful in circumstances other than Conneg, and it's good that the spec leaves it a bit open as to when it can be used.

(and now maybe I'm going to touch on other aspects than conneg, I will not be offended if someone declares this completely off-scope)

For example, tokens could be very useful in human-readable docs, to control how one mentions a profile there. It would be good if what people use in documents can be said to be an identifier, too, albeit within the scope of a (set of) document(s). And it's even better if they use the same string as the one that's used in Conneg. This is the sort of trick that helps developers a lot when they implement something, isn't it?

There are certainly some risks, but in fact I believe the property as it is defined actually offers more benefits than risks with respect to controlling the proliferation of "identifiers" across the board. I.e. I believe the situation would be worse without the property as it stands.

And I have read the arguments about the lack of a central authority/registry. So again, yes, there's a risk, but at least the property allows the creator of the profile (if they're in control of the PROF metadata served at the URI for the profile) to have an authoritative say in how they prefer things to be done.

nicholascar commented 4 years ago

I've just checked the doc: "identifier" appears 4 times, 3 of them in relation to token in the hasToken property definition & usage note. In those two areas, qualification text such as "in circumstances where its URI cannot be used" is present, so there doesn't seem to be much point in changing anything.