opengeospatial / SELFIE

Second Environmental Linked Feature Interoperability Experiment
https://opengeospatial.github.io/ELFIE

Testing plan for relating NIRs and IRs #19

Closed dblodgett-usgs closed 5 years ago

dblodgett-usgs commented 5 years ago

We need to flesh out the considerations and issues related to the relationship between non-information-resources (NIRs) and information resources (IRs).

This includes things like:

  1. What do search engines do with NIRs when they get a 303 to an IR at a different URL? This query suggests they index the NIR: https://www.google.com/search?q=Richelieu+aquifer
  2. What happens when people use the IR URL as an ID for the NIR and things get mixed up? The proposal is to declare the IR and NIR owl:sameAs so they get interpreted as the same thing.
  3. What happens when they are the same URL because the same group minted the ID and maintains the index about that ID, and doesn't want the complexity of an NIR redirect?
  4. Add other considerations as this gets fleshed out.

alpha-beta-soup commented 5 years ago

Perhaps the resource/data identified by the IR URI should establish that <NIR URI> rdfs:isDefinedBy <IR URI>? There's an explicit example of that here: https://www.w3.org/TR/cooluris/#linking

All the URIs related to a single real-world object—resource identifier, RDF document URL, HTML document URL—should also be explicitly linked with each other to help information consumers understand their relation.

(Emphasis mine.)

rdfs:isDefinedBy is a subclass of owl:sameAs; but owl:sameAs seems too symmetrical and too weak.

abhritchie commented 5 years ago

Actually, owl:sameAs is very strong and does exactly what we want it to do: thing 1 is the same as thing 2 and all statements made about thing 1 also apply to thing 2 and vice versa.
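
To make the strength of owl:sameAs concrete, here is a minimal sketch in plain Python (no RDF library). The URIs and property values are hypothetical placeholders, and the closure only substitutes subjects, so it is a simplification of what a real OWL reasoner does:

```python
# Toy illustration of owl:sameAs semantics using plain tuples.
# All URIs and values below are hypothetical placeholders.
NIR = "https://example.org/id/aquifer/1"
IR = "https://example.org/info/aquifer/1"

triples = {
    (NIR, "rdfs:label", "Example aquifer"),
    (IR, "dcterms:modified", "2019-06-01"),
    (NIR, "owl:sameAs", IR),
}


def same_as_closure(triples):
    """Copy every statement across each owl:sameAs pair, in both directions.

    Simplified: only the subject position is substituted.
    """
    result = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for s, p, o in result if p == "owl:sameAs"]
        for a, b in pairs:
            for source, target in ((a, b), (b, a)):
                for s, p, o in list(result):
                    if p == "owl:sameAs" or s != source:
                        continue
                    if (target, p, o) not in result:
                        result.add((target, p, o))
                        changed = True
    return result


inferred = same_as_closure(triples)
```

After the closure, the page's modification date is also a statement about the aquifer, and the aquifer's label is also a statement about the page. That is the "vice versa" described above.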

abhritchie commented 5 years ago

I think we might be missing the obvious here too. In the specific example that kicked this issue off, when dereferencing the NIR URI returns an 'info page' IR, the @id value in the JSON-LD returned or embedded in the response should be the URI of the NIR, which achieves the same thing as the owl:sameAs we are proposing. Recall that the info page is a convenience resource to support search engine indexing and present paths to related 'data' resources. It's not an (owl:) thing in its own right.
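
A rough sketch of what that could look like, assuming hypothetical URIs and a schema.org context (neither is taken from the actual pilot infrastructure):

```python
import json

# Hypothetical identifiers, for illustration only.
NIR_URI = "https://example.org/id/aquifer/1"
INFO_PAGE_URL = "https://example.org/info/aquifer/1"

# JSON-LD served from (or embedded in) the info page. Note that "@id" is the
# NIR URI, not the URL the page was fetched from.
info_page_jsonld = {
    "@context": "https://schema.org/",
    "@id": NIR_URI,
    "@type": "Place",
    "name": "Example aquifer",
}


def describes(doc_text, uri):
    """Return True if the top-level JSON-LD node is identified by `uri`."""
    return json.loads(doc_text).get("@id") == uri


body = json.dumps(info_page_jsonld, indent=2)
```

With this shape, a consumer that lands on the info page still attaches every statement to the NIR URI, so no extra triple linking the page and the feature is needed for those statements.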

abhritchie commented 5 years ago

I think SIRF is on the right track by treating the alternates page (equivalent to the info page) as a view of the NIR.

jvanulde commented 5 years ago

I'm leaning towards using rdfs:isDefinedBy rather than owl:sameAs, as @alpha-beta-soup suggested. The problem with overloading owl:sameAs to describe the relation between a NIR and an IR is that a consumer (e.g. a harvester) can't distinguish between an actual relation between resources and a NIR/IR relation.

abhritchie commented 5 years ago

Fair call WRT use of owl:sameAs. I was getting a bit caught up with Boyan's suggestion in the original conversation that triggered this issue and its focus on the info page. For NIR to IR, rdfs:isDefinedBy is good (note though that it is a subproperty of rdfs:seeAlso, not owl:sameAs). But, and I've said it a few times now, the idea of an IR is too broad. We have at least two types of IR: the 'info' IR (IIR) and the 'data' IR (DIR).

  1. NIR rdfs:isDefinedBy DIR: makes sense, and I imagine the RDF in an IIR would contain these triples.
  2. NIR rdfs:isDefinedBy IIR: I need convincing that the info page/resource is a definitive resource.
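
Written out as N-Triples, the two candidate statements would look roughly like this. The URIs are hypothetical and the helper is just string formatting, not an RDF library call:

```python
# Namespace of RDF Schema; the three resource URIs are hypothetical.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
NIR = "https://example.org/id/aquifer/1"        # the real-world feature
DIR = "https://example.org/data/aquifer/1.ttl"  # a 'data' IR
IIR = "https://example.org/info/aquifer/1"      # the 'info' IR (info page)


def ntriple(subject, predicate, obj):
    """Format one statement with three IRI terms as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


statements = [
    # The uncontroversial case: the data resource defines the feature.
    ntriple(NIR, RDFS + "isDefinedBy", DIR),
    # The contested case: is the info page really a definitive resource?
    ntriple(NIR, RDFS + "isDefinedBy", IIR),
]

for line in statements:
    print(line)
```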

dblodgett-usgs commented 5 years ago

Based on conversation in Quebec City -- we concluded that whether we use 303 redirects between NIRs and MIRs doesn't make much of a difference. We will recommend keeping conceptual separation (separate URIs) between the NIR and the MIR and plan on testing the case where there is no separation.

We will recognize the issues inherent in indirect identification (same URI) and recommend that the two URIs are distinct.

The second issue -- what should a reasoner do in relation to the NIR when it encounters the MIR URI? Should we publish a set of owl:sameAs relations between our NIRs and MIRs? -- no. We will not implement the sameAs relation or any other relation to relate the NIR and MIR.

This should be a consideration for the HTML version of the MIR. #16 is relevant here.

dblodgett-usgs commented 5 years ago

This issue can be closed when the above conclusion has been drafted into engineering report form.

dblodgett-usgs commented 5 years ago

How's this read?

The SELFIE resource model, with its non-information, meta-information, and data tiers, is focused on the nature of the resource and not the structure of the identifier used to dereference the resource. However, there is a need to establish expected behavior when dereferencing URIs for information at each of these three tiers. The SELFIE project tested two behaviors between NIR and MR:

  1. A 303 redirect from a URI for a NIR to the URL for an MR
  2. Indirect identification of the NIR by the URL that locates an MR, which responds with a 200 status code.
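
As a rough sketch of the two behaviors, the functions below map a request path to a status code, headers, and body. The paths and the HTML body are hypothetical, and no real web server is involved:

```python
# Hypothetical paths: one for the feature identifier, one for its MR page.
NIR_PATH = "/id/aquifer/1"
MR_PATH = "/info/aquifer/1"
MR_BODY = "<html><body>Example aquifer</body></html>"


def respond_with_redirect(path):
    """Behavior 1: the NIR URI answers 303 See Other, pointing at the MR URL."""
    if path == NIR_PATH:
        return 303, {"Location": MR_PATH}, ""
    if path == MR_PATH:
        return 200, {"Content-Type": "text/html"}, MR_BODY
    return 404, {}, ""


def respond_indirect(path):
    """Behavior 2: one URL both identifies the NIR and locates the MR (200)."""
    if path == NIR_PATH:
        return 200, {"Content-Type": "text/html"}, MR_BODY
    return 404, {}, ""


status, headers, _ = respond_with_redirect(NIR_PATH)
print(status, headers)
print(respond_indirect(NIR_PATH)[0])
```

In the first case a client sees two distinct URLs, one before and one after the redirect. In the second it only ever sees one.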

The distinction between NIRs and MRs located at different URIs introduces the potential for confusion. Indexes of MR content might get associated with the MR URL rather than the NIR URI that the MR documents. MR URLs might get used as NIR identifiers because of accidental use of the URL returned by the 303 redirect. SELFIE has tested the former by exposing pilot infrastructure to the public internet and found that, as long as statements in the MR reference the NIR URI, search engines and indexers will work appropriately. The second issue can be mitigated by placing NIR URIs prominently on the HTML representation of an MR and structuring MR URLs in such a way that it is clear that they are not meant to be used as identifiers.

Indirect identification of NIRs using MR URLs has been tested successfully by SELFIE participants. It was found that this practice limits the flexibility of systems to accommodate multiple agencies, and that the tight coupling of NIR to MR introduces technical challenges for MR resolution. However, the practice is less complicated from a user perspective and, as long as the NIR URIs are stable, is compatible with future decoupling of NIR and MR URIs when the separation is needed. These two cases, 303 and 200 response codes to a NIR URI, illustrate the flexibility of the SELFIE resource model as well as the nature of an MR as a convenient resource that encodes a set of statements relating NIRs to each other and to data resources.

jvanulde commented 5 years ago

I recall that we discussed using an about relation to link the MIR to the NIR...

dblodgett-usgs commented 5 years ago

Since schema:about is the inverse of schema:subjectOf, I guess we would just add the MR URL to the NIR's set of schema:subjectOf relations? I'm going to put that in the example after discussion with @abhritchie.
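
Something like the following, as a sketch only. The URIs are hypothetical, the node shape is an assumption rather than the agreed SELFIE example, and `add_subject_of` is an illustrative helper, not part of any library:

```python
import json

# Hypothetical identifiers, for illustration only.
NIR_URI = "https://example.org/id/aquifer/1"
MR_URL = "https://example.org/info/aquifer/1"
DATA_URL = "https://example.org/data/aquifer/1.ttl"

# Description of the NIR, already listing one data resource it is the subject of.
nir_description = {
    "@context": "https://schema.org/",
    "@id": NIR_URI,
    "@type": "Place",
    "name": "Example aquifer",
    "subjectOf": [{"@id": DATA_URL}],
}


def add_subject_of(description, url):
    """Add `url` to the node's schema:subjectOf list unless it is already there."""
    targets = description.setdefault("subjectOf", [])
    if all(item.get("@id") != url for item in targets):
        targets.append({"@id": url})
    return description


add_subject_of(nir_description, MR_URL)
print(json.dumps(nir_description, indent=2))
```

Read in the other direction, each listed resource is schema:about the NIR, which is the relation recalled in the previous comment.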

dblodgett-usgs commented 4 years ago

Just copied and rewrote the text in this issue here. https://docs.google.com/document/d/1b-TdqWl3jf2jJD7dYvLkxTa6JEGxtUJAue7jSiaEE2s/edit#bookmark=id.hc1p9qsw7zga