w3c / dx-connegp

Content Negotiation by Profile
https://w3c.github.io/dx-connegp/connegp/

Web browser navigation of profile information #14

Closed nicholascar closed 4 years ago

nicholascar commented 6 years ago

Use case: web browser navigation of profile information

Status: Proposed

Identifier:

Creator: Nicholas Car

Deliverable(s): Content Negotiation

Tags conneg profile

Stakeholders

data consumer

Problem statement

For human consumers of Linked Data, it is difficult to use HTTP headers such as Accept to perform content negotiation. Web browsers, the primary tool humans use to access Internet resources, do not lend themselves easily to HTTP header setting for non-advanced users.

In addition to any HTTP Content Negotiation-based approaches to navigating available profiles of information about a resource, we wish to provide web-browser-friendly equivalent methods for humans.

Existing approaches

Alternates View

Established practice in Australian science agencies is to have a catalogue server that is able to respond to Query String Arguments (QSA) in place of, or even overriding, HTTP headers.

When using the Geoscience Australia Samples Catalogue, data consumers are able to ask for different profiles of metadata about samples using a _view QSA and different formats either using the HTTP Accept header or a _format QSA which, if present, will override the Accept header. For Sample AU239, some available profiles and the formats they are available in are:

| Name | Token | Formats | Namespace | Description |
|---|---|---|---|---|
| PROV Ontology | prov | text/html, text/turtle, application/rdf+xml | http://www.w3.org/ns/prov/ | The W3C's provenance data model |
| SOSA Ontology | sosa | text/turtle, application/rdf+xml, application/rdf+json | http://www.w3.org/ns/sosa/ | The W3C's Sensor, Observation, Sample, and Actuator ontology within the Semantic Sensor Networks ontology |
| IGSN Schema 1 | igsn-r1 | text/xml | http://schema.igsn.org/description/1.0 | Version 1 of the official IGSN XML schema |
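
The override rule described above (a _format QSA, if present, wins over the Accept header) can be sketched as follows. This is only an illustration: the function name `choose_format`, the `VIEW_FORMATS` mapping, the simplified Accept parsing and the fallback to the first listed format are all assumptions, not the Samples Catalogue's actual behaviour.

```python
from typing import Optional

# Formats per view token, copied from the table above.
VIEW_FORMATS = {
    "prov": ["text/html", "text/turtle", "application/rdf+xml"],
    "sosa": ["text/turtle", "application/rdf+xml", "application/rdf+json"],
    "igsn-r1": ["text/xml"],
}


def choose_format(view: str, query: dict, accept: Optional[str]) -> Optional[str]:
    """Pick a media type for `view`; a _format QSA overrides the Accept header."""
    available = VIEW_FORMATS.get(view, [])
    requested = query.get("_format")
    if requested is not None:
        # QSA present: the Accept header is not consulted at all.
        return requested if requested in available else None
    if accept:
        # Simplified Accept handling (assumption): q-values and wildcards are ignored.
        for item in accept.split(","):
            media_type = item.split(";")[0].strip()
            if media_type in available:
                return media_type
    # Assumed default: the first format listed for the view.
    return available[0] if available else None
```

For example, `choose_format("sosa", {"_format": "application/rdf+xml"}, "text/turtle")` returns the QSA value rather than the one in the Accept header.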

Current Process:

  1. user follows URI to resource, e.g. Sample AU239: http://pid.geoscience.gov.au/sample/AU239
  2. user knows, by the Alternates View convention, that appending _view=alternates to the URI will yield a list of the alternative views and formats for the resource, as per the table above: http://pid.geoscience.gov.au/sample/AU239?_view=alternates
  3. user selects a view and a format to their liking or relies on defaults. The view is indicated by a token, and the token is described by a Name, Description and Namespace, as per the table above
  4. user constructs a URI using _view & _format QSAs based on the view token and format MIME type. For Sample AU239, the SOSA Ontology view uses token sosa and the Turtle format uses MIME type text/turtle, giving the URI http://pid.geoscience.gov.au/sample/AU239?_view=sosa&_format=text/turtle
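
The URI construction in the steps above can be sketched in a few lines. The helper name `profile_uri` is hypothetical. Percent-encoding "+" in media types such as application/rdf+xml is an assumption made here so that the plus sign is not read as a space; the catalogue may accept other spellings.

```python
from urllib.parse import urlencode


def profile_uri(resource_uri, view=None, media_type=None):
    """Append _view and _format query string arguments to a resource URI."""
    params = {}
    if view:
        params["_view"] = view
    if media_type:
        params["_format"] = media_type
    if not params:
        # No QSAs requested: the server's defaults apply.
        return resource_uri
    # Keep "/" readable in media types; "+" is percent-encoded as %2B.
    return resource_uri + "?" + urlencode(params, safe="/")


sample = "http://pid.geoscience.gov.au/sample/AU239"
alternates = profile_uri(sample, view="alternates")
sosa_turtle = profile_uri(sample, view="sosa", media_type="text/turtle")
```

Here `alternates` and `sosa_turtle` reproduce the two URIs given in steps 2 and 4.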

The existing approach does:

This existing approach does not:

OAI-PMH

The Open Archives Initiative Protocol for Metadata Harvesting is a well established protocol for requesting XML-based metadata for resources. It allows for profile listing via a QSA command paired with a resource identifier and for profile retrieval using a metadataPrefix QSA. Given that all responses are in XML, OAI-PMH implements profiles but not formats.

The same system that delivers GA's Samples metadata, as mentioned above, can deliver OAI-PMH metadata too. Some example requests for the sample AU239 are:

| URI | Result |
|---|---|
| http://pid.geoscience.gov.au/samples/oai?verb=ListMetadataFormats&identifier=AU239 | lists the metadata profiles available for the resource with identifier AU239 |
| http://pid.geoscience.gov.au/samples/oai?verb=GetRecord&identifier=AU239&metadataPrefix=oai_dc | gets the metadata for resource AU239 according to the "oai_dc" profile (described in ListMetadataFormats) |
| http://pid.geoscience.gov.au/samples/oai?verb=GetRecord&identifier=AU239&metadataPrefix=igsn | gets the metadata for resource AU239 according to the "igsn" profile |
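
The request URIs above follow a fixed pattern of verb, identifier and metadataPrefix QSAs. A minimal sketch of building them follows. The helper names are hypothetical, and the endpoint is the one shown in the examples.

```python
from urllib.parse import urlencode

# Endpoint taken from the example requests above.
OAI_ENDPOINT = "http://pid.geoscience.gov.au/samples/oai"


def list_metadata_formats_url(identifier, endpoint=OAI_ENDPOINT):
    """URL that lists the metadata profiles available for one resource."""
    query = {"verb": "ListMetadataFormats", "identifier": identifier}
    return endpoint + "?" + urlencode(query)


def get_record_url(identifier, metadata_prefix, endpoint=OAI_ENDPOINT):
    """URL that retrieves one resource's metadata in the given profile."""
    query = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return endpoint + "?" + urlencode(query)
```

Note that there is no format parameter here: as stated above, OAI-PMH responses are always XML, so only the profile varies.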

Other Implementations

Other implementations of profile negotiation that are similar to OAI-PMH include the Open Geospatial Consortium's Catalogue Service for the Web, which uses only XML as the format but allows for QSA-based profile listing and selection.

Links

Systems implementing the existing approach:

Requirements

Provide a way, for each profile negotiation function enabled by an HTTP method, to trigger the same action in a browser-friendly way

Related use cases

Requirement 6.5.2 Distribution schema [RDIS]
Requirement 6.8.4 Profiles listing [RPFL]
Requirement 6.8.2 Profile representation [RPFRP]

andrea-perego commented 6 years ago

Thanks, @nicholascar . I suggest you add UC30 as a related use case: it raises a similar issue and uses CSW and OAI-PMH endpoints as examples (as in your UC). You can also add requirement 6.8.3 Profile negotiation [RPFN].

larsgsvensson commented 6 years ago

The use case to allow for browser-based navigation of profile information is highly relevant. Using query string arguments to access alternative views, as the existing implementation does, is one possible way of doing it; another is to add this information to your resource name (the last part of the URI) using dot-syntax (like having language variants of the same page and naming them html.en or html.sv, which I think is the way the Apache content negotiation for Accept-Language works).
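
A rough sketch of the dot-syntax idea: trailing suffixes on the resource name are read as variant tokens. The function name `split_variant` and the token sets are made up for illustration; this is not how Apache implements it.

```python
def split_variant(resource_name, known_tokens):
    """Split trailing dot-suffixes that are known variant tokens off a name."""
    parts = resource_name.split(".")
    suffixes = []
    # Peel suffixes from the right for as long as they are recognised tokens.
    while len(parts) > 1 and parts[-1] in known_tokens:
        suffixes.insert(0, parts.pop())
    return ".".join(parts), suffixes


base, tokens = split_variant("index.html.en", {"html", "en", "sv"})
```

Here `base` is "index" and `tokens` is ["html", "en"]; a server could map such tokens to profiles or formats in the same way it maps them to languages.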

RubenVerborgh commented 6 years ago

Yeah, but OAI-PMH dates back to 2002 and has little to do with the Web. Talk to OAI-PMH co-author @hvdsomp and he'll be the first to tell you how OAI-PMH was not a Web-oriented way of thinking.

You can see a much more evolved line of thinking in the Memento spec (by the same authors), which like our work here also tackles content negotiation, and does not use any magic URIs.

I agree we should make it easy for people too, but let's just use the Web. If you want to point people to something, just link to it. Don't ask them to change the address bar, and don't mandate what the server's URI should be. And URI construction doesn't solve discovery, whereas linking does.

BTW In the case of Memento, there's also a browser extension that lets users do the negotiation.

RubenVerborgh commented 6 years ago

I think this is also why we should separate requirements and solutions. The requirement of human-based navigation is clear and I think we can all agree on that. But solutions are something different.

nicholascar commented 6 years ago

I hear you @RubenVerborgh: this is a Use Case just with examples of past practice for backgrounding information only. I'm keen to see newer/better practice established as long as it meets the human-based navigation requirement. So this is requirements before solutions.

aisaac commented 6 years ago

@nicholascar we have discussed your use case in the Profile Negotiation call and we're not sure everything is clear. At least for me ;-) I got some explanations from others (@RubenVerborgh and @rob-metalinkage) but I think the use case should be worded more clearly, especially on these questions:

  1. Is this case about (a) negotiating for data, i.e. to get data for a URI according to some profile, but not using accept headers, or (b) negotiating for profiles, i.e. to get a description of a profile in HTML, or both? If it's both, is there a prioritization that would help understand better? In our call, the others made me understand that it is chiefly about (a), but that a user doing (a) first has to get some profile info in order to be able to do (a). But this I think is not reflected by the title's emphasis on "navigation of profile information".

  2. Could there be a more concrete example of what the negotiation without accept headers would look like? Again I understand that this is the object of the part on Query String Arguments (QSA) and _view, but it would be clearer if you'd show a concrete "data URI" being queried, with a concrete profile parameter (even if you make these URIs up, at least one would see the structure of the solution).

nicholascar commented 6 years ago

Regarding:

  1. both. Regarding the wording, it could be changed to: "manual discovery of profiles". It's exactly the same purpose as the HTTP spec proposal but to be carried out manually, by people, not automatically.
  2. I've expanded on the process of accessing profiles for the example in the "Alternates Views" section above with a "Current Process" set of steps that a user can follow. The example resources are real and a series of Linked Data platforms in operation enable the same steps.

azaroth42 commented 6 years ago

+0.

So long as there isn't a requirement to implement this, I'm not going to lie down in the road against it. By which you can infer that I would be -1 if there's a MUST for implementation, and a very strong -1 if it overrides the web architecture preferred method of HTTP headers.

aisaac commented 6 years ago

I'm quite split. To me in theory the use case could be approved, requirement(s) derived from it, and then we decide to reject the requirements if we think they're too much at odds with web architecture. This is tedious, but this would document why we've not gone this way. And I think this can be useful, as others may claim that this use case corresponds to a desire they have. For one I would understand that people used to the OAI-PMH parameter could be tempted to push to have this pattern still available in a more general setting. So with this in mind I would vote for saying that the use case is in scope.

On the other hand, our charter (https://www.w3.org/2017/dxwg/charter) dictates that content negotiation by application profile should be the only delivery option that we explore. So this would be a reason to rule out the part about using alternatives to negotiation. The one thing the charter does not explicitly reject would be the part about discovering the available profiles, i.e. the ListMetadataFormats parameter from OAI-PMH (https://www.openarchives.org/OAI/openarchivesprotocol.html#ListMetadataFormats). But without the other pattern to access data in certain profiles without conneg, this would be a bit moot.

In any case, @nicholascar has answered the questions I had. And the title could be indeed changed to reflect better the content of the use case. I had understood 'profile information' to be 'metadata describing the profile', i.e. ProfileDesc statements...

@larsgsvensson @RubenVerborgh @rob-metalinkage I have an action to call for your feedback about this case, now that it's been edited by Nick :-)

agreiner commented 6 years ago

I think this use case is extremely important and fully in scope. Our charter tells us to develop guidance in the use of profiles, and if we are going to be suggesting the use of content negotiation, I would suggest that item number one in the list of principles would be to also provide a means of profile discovery without content negotiation. I don't see that as at all at odds with web architecture. Conneg requires that the representations to be served have dereferenceable URIs anyway. We needn't (and IMHO shouldn't) specify the use of query strings.

One of my biggest concerns with profile negotiation is that the content of the underlying resource is different in the representations served by content negotiation. Clients are limited to an automated negotiation even though information to which the user has no access is relevant to the choice. For example, the same dataset distribution served for different profiles may offer more content with a profile that describes more things. The user may want the fuller set of data but would have no means of knowing that it exists. If we assume use only by profile-specific applications, then the application cannot do anything with the alternatives that it is configured not to accept, so the issue goes away. But I can easily imagine applications that are not limited to a single profile or even a short list of them. Even if they use * to accept all possible profiles, they only get one. They may get a list in the reply header, but there is no reasonable way to indicate the differences in content between them. Human consumers won't even get that, unless we address this issue.

rob-metalinkage commented 6 years ago

+1 @agreiner - and ProfileDesc (new name wanted) is a response to this concern, as there does not appear to be a viable alternative way of expressing without forcing use of a specific profile constraints language, which is unrealistic.

It may not be possible to define a single canonical means of profile discovery - but if profiles themselves are URIs and support dereferencing - then perhaps it is possible to state they SHOULD resolve to a canonical representation relevant to the implementation platform - i.e. Linked Data should use RDF, XML based protocols should use an XML schema equivalent, and MAY use profile negotiation to access alternative forms.

kcoyle commented 6 years ago

Thanks, @agreiner . First, the charter reads "guidance for the publication of profiles" and I've been trying to figure out - for my own self - what significance the word "publication" has there. I'm tending to assume that the purpose of publication is usability, so maybe we shouldn't focus overly much on that word. Can we interpret that as meaning How can one publish profiles so that they will be more usable? And does that intend something more than making them machine-actionable, or could it extend to a full profile "ecology"? @philarcher ?

You say: "For example, the same dataset distribution served for different profiles may offer more content with a profile that describes more things. The user may want the fuller set of data but would have no means of knowing that it exists." In some past conversations, I believe it was @larsgsvensson who stated that conneg would not itself solve that problem, and that the information may be held by either the requester or the server. On the other hand, @RubenVerborgh 's use case (UC ID3) illustrates making a selection based on properties in the profile:

"For example, a profile X could demand that all persons are described with the FOAF vocabulary, and a profile Y could demand that all books are described with the Schema.org vocabulary. Then, a response which uses FOAF for people and Schema.org for books, clearly conforms to both profiles. "

When you say: "The user may want the fuller set of data but would have no means of knowing that it exists" what do you think would be needed so that the user can discover what profiles exist? What kind of description would give users that information?

The follow-up, depending on the answer, is whether this is something we are prepared to tackle within our work.

(Note: we have a rather vague requirement that profiles be discoverable, but that hasn't been elaborated.)

agreiner commented 6 years ago

The simplest solution to this is for the publisher of the dataset to offer an html page with links to the different distributions, and human readable text summarizing the differences between them, something like the following:

The figglewood dataset is available in conformance to the following profiles:

snorbert (sleep apnea research)
sippit (beer and wine tasting)

The sippit profile offers the same data as snorbert with the addition of taste evaluation descriptors.
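
As a rough illustration of such a page, the sketch below renders the example text as HTML with one link per profile. The function name `profiles_page` and the example.org link targets are invented for the sketch; only the dataset name, profile names and wording come from the example above.

```python
from html import escape


def profiles_page(dataset, profiles, note=""):
    """Render a minimal HTML page linking to each profile's distribution."""
    lines = [
        "<!DOCTYPE html>",
        f"<title>{escape(dataset)} dataset</title>",
        f"<p>The {escape(dataset)} dataset is available in conformance "
        "to the following profiles:</p>",
        "<ul>",
    ]
    for token, description, href in profiles:
        # One plain link per profile: no URI construction needed by the reader.
        lines.append(
            f'  <li><a href="{escape(href)}">{escape(token)}</a> '
            f"({escape(description)})</li>"
        )
    lines.append("</ul>")
    if note:
        lines.append(f"<p>{escape(note)}</p>")
    return "\n".join(lines)


page = profiles_page(
    "figglewood",
    [
        # Hypothetical link targets; any dereferenceable URIs would do.
        ("snorbert", "sleep apnea research", "https://example.org/figglewood/snorbert"),
        ("sippit", "beer and wine tasting", "https://example.org/figglewood/sippit"),
    ],
    note="The sippit profile offers the same data as snorbert "
    "with the addition of taste evaluation descriptors.",
)
```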

azaroth42 commented 6 years ago

@agreiner I agree completely ... but is there any specification needed beyond a recommendation that such an HTML page be available?

akuckartz commented 6 years ago

but is there any specification needed beyond a recommendation that such an HTML page be available?

The HTML page is one representation of a resource. What about other representations? What is the resource?

agreiner commented 6 years ago

If we are limited to guidance, I don't think there's much else we can do anyway. But I think it's useful to keep the use case in mind when deciding how to address other requirements, to be sure we don't somehow preclude it. I also think there is a discussion to be had around incentives. I worry that use of conneg will incentivize publishing without providing human navigable profile info.

agreiner commented 6 years ago

@akuckartz, the resource is a human-readable description of the available distributions of a dataset. The representation is an html file, in English. It could have other representations, like any other resource on the web.

makxdekkers commented 6 years ago

@agreiner your definition of "a human-readable description of the available distributions of a dataset" is more or less what dcat:landingPage is often used for.

aisaac commented 6 years ago

All this sounds good. I'm a bit worried that all this discussion is made a bit more complex by the fact that the use case calls for two things (1. know which profiles are available for some data, 2. request one profile for the data using something other than conneg). And when we start discussing solutions it seems that we only address one of the aspects. But I now feel even more comfortable than I was earlier with accepting this use case.

larsgsvensson commented 6 years ago

@aisaac scripsit:

@larsgsvensson @RubenVerborgh @rob-metalinkage I have an action to call for your feedback about this case, now that it's been edited by Nick :-)

I find this UC highly relevant and we really should derive requirements from it. As Antoine says, it might be that we don't approve all requirements but then at least we'll know why...

@agreiner scripsit:

I worry that use of conneg will incentivize publishing without providing human navigable profile info.

Yes, might be. And there seems to be rough consensus that the Profile Guidance Document should tell implementers that they SHOULD offer human-readable documentation on what profiles are available (possibly being different ones for different media types) and MAY tell the user how to get data adhering to those profiles without using conneg. Do I see that right?

An aside to MUST vs SHOULD: AFAIK, when making a SHOULD requirement it's good practice to give examples of circumstances when it's not necessary to implement the requirement (thus not making it a MUST). What circumstances could be considered so special that we don't mandate the existence of an HTML page?

larsgsvensson commented 6 years ago

@rob-metalinkage scripsit via email:

I think we have existing profiles where the description is a PDF or something - most OGC specifications are like this. SHOULD I agree - (and i intend to use profileDesc to create at least a minimal landing page with links to these docs for all such profiles).

Likewise IMHO profiles SHOULD have links to appropriate machine readable constraint specification - and SHOULD declare relationships to other profiles and links to such documents using a canonical description language (candidate is profileDesc). I don't think we can say MUST for any of these, but profile guidance can recommend them all strongly with SHOULD.

more contentious is whether, in the presence of multiple possible profiles to represent data we state whether header based Conneg is a MAY or SHOULD, and likewise for an alternates view - is that a SHOULD? (I'm personally inclined to a SHOULD if conneg is supported)

anyway - let's get the Profile Guidance group into action and start discussing these things - we have requirements gradually being voted on in plenary we can start with, and we do not seem to be significantly changing the existing straw man set of requirements (only improving mutual understanding and wording) - so we should be able to start to flesh out a core scope for profile guidance.

Rob

kcoyle commented 6 years ago

I suspect we cannot use SHOULD with validation schemas until they are more accessible. SHACL is only available via TopBraid and ShEx doesn't have an installable app at all. I hope SHOULD is in our future, but I don't think we are there yet. XML schema exists but only for XML-defined metadata.

kcoyle commented 6 years ago

Hmmm. Thinking about the rule "mandatory if applicable" - maybe we can have "should if available".

aisaac commented 6 years ago

It seems like we're past approving the case, and now discussing requirements and maybe even solution space ;-)

I can't resist the discussion however, and here's what I would support:

rob-metalinkage commented 5 years ago

QSA section included in document.

larsgsvensson commented 5 years ago

decided to untag "profile-negotiation" in conneg meeting 2019-03-13

aisaac commented 5 years ago

I am quite puzzled as to why we would have to close an issue that represents a use case. What does it mean?

kcoyle commented 5 years ago

IF we consider this to be in scope, perhaps we need a label for deferred features, assuming that Conneg has decided not to address this at this time.

nicholascar commented 5 years ago

It is very much in scope and is, in fact, catered for in doc (see Rob: "QSA section included in document." above).

It was once un-tagged from "profile-negotiation" as we tried to work out how to cater for issues. Then re-tagged and tagged as profneg-due-for-closing to indicate that, from Conneg's point of view, work on this issue was complete.

Common GitHub Issue usage sees open Issues as things needing to be addressed. Since this has been addressed, it should be closed. It's fine to keep it as a record of the Use Case (and it will, of course, automatically be kept!) but it should be marked as closed.

nicholascar commented 5 years ago

Closing after listing in plenary 2019-09-03 + 3-day wait period.