Closed dblodgett-usgs closed 4 years ago
Just adding links: #33 #41
Slightly modded from a chat Bruce and I had yesterday:
My assumption is that our JSON-LD data resource documents are RDF docs with contexts that map to OWL ontologies or RDF Schema. We therefore use @id and @type to identify things and bind them to RDF classes*. Data documents returned as JSON-LD should have been prepped accordingly**. The url/foaf:Document pattern is for meta resource docs where we link to documents that are stored elsewhere and may not be, in fact most likely are not, RDF docs … they could just be HTML pages, or GML or … Because of this high risk of non-RDF content (and the moral and philosophical problems it raises) we don’t use formal RDF semantics to bind to the linked document. For starters we wrap the link in a schema:subjectOf property (loose schema.org linking rather than strong RDF linking), and we kept the looseness going by proposing the “url” property: it’s just a link, not an HTTP URI. The foaf:primaryTopic property is a strong hint at the type of the target but not an explicit binding to an RDF class (I understand this to be consistent with the soft/hard typing view Simon has expressed in a DXWG thread about dct:type).
* Meaning we have to create RDF/OWL ontologies for all our application schemas. Meaning Abdel’s work is the most important right now.
** If you can’t serve data that can be converted into a legitimate RDF document then don’t use JSON-LD.
Thanks @abhritchie for the updates. I'm recording an email to the SELFIE list here for reference.
The SELFIE example documents have been updated and merged into the master branch.
https://github.com/opengeospatial/SELFIE/tree/master/docs/examples
The changes:
This document is what I believe to be a minimal example (derived from here) that demonstrates the use of "schema:url" : "http://url" to reference formal RDF/JSON-LD data resources and, more generally, to reference any data resource within a "schema:subjectOf" block.
(Updated 9/23/19 based on @abhritchie comment below)
{
  "@context": [
    "https://opengeospatial.github.io/SELFIE/contexts/elf-index.jsonld",
    "https://opengeospatial.github.io/SELFIE/contexts/elf-data.jsonld"
  ],
  "@id": "https://lab.scinfo.org.nz/selfie/id/aquifersystem/richelieu",
  "@type": "gw:GW_AquiferSystem",
  "name": "Aquifer system : Richelieu",
  "subjectOf": [
    {
      "url": "https://lab.scinfo.org.nz/selfie/data/aquifersystem/richelieu",
      "@type": "foaf:Document",
      "primaryTopic": "gw:GW_AquiferSystem",
      "format": [
        "text/ttl",
        "application/ld+json"
      ]
    },
    {
      "url": "https://lab.scinfo.org.nz/service/selfie/api/collections/data/items/aquifersystem-richelieu",
      "@type": "foaf:Document",
      "primaryTopic": "gw:GW_AquiferSystem",
      "format": [
        "application/vnd.geo+json",
        "text/html"
      ]
    }
  ]
}
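For anyone wanting to see how a client might consume this, here is a minimal sketch (not part of the SELFIE repo) that walks the "subjectOf" array and picks out links by advertised format. The function name find_links is hypothetical, and the sketch reads the document as plain JSON without resolving the contexts, which is an assumption about how lightweight a client could be.

```python
import json

# Trimmed copy of the example above. The @context is omitted because this
# sketch only reads plain JSON keys and does no JSON-LD processing.
INDEX_DOC = json.loads("""
{
  "@id": "https://lab.scinfo.org.nz/selfie/id/aquifersystem/richelieu",
  "@type": "gw:GW_AquiferSystem",
  "subjectOf": [
    {"url": "https://lab.scinfo.org.nz/selfie/data/aquifersystem/richelieu",
     "@type": "foaf:Document",
     "primaryTopic": "gw:GW_AquiferSystem",
     "format": ["text/ttl", "application/ld+json"]},
    {"url": "https://lab.scinfo.org.nz/service/selfie/api/collections/data/items/aquifersystem-richelieu",
     "@type": "foaf:Document",
     "primaryTopic": "gw:GW_AquiferSystem",
     "format": ["application/vnd.geo+json", "text/html"]}
  ]
}
""")


def find_links(doc, wanted_format):
    """Return the url of every subjectOf entry that advertises wanted_format."""
    links = []
    for entry in doc.get("subjectOf", []):
        formats = entry.get("format", [])
        if isinstance(formats, str):  # tolerate a single string value
            formats = [formats]
        if wanted_format in formats:
            links.append(entry["url"])
    return links
```

Calling find_links(INDEX_DOC, "text/html") returns only the API link, which is the kind of low-overhead access the "url" pattern is meant to allow.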
@denevers @brodaric -- Can you verify that this is where we want to land?
@bsimons14 @amacleod-cerdi @abhritchie -- Can you provide some description of the nuances here?
I need to read a bit more just to be sure I understand the issue correctly.
If I understand this correctly based on Alistair's comments above, then we are using "@id" when the resource is in RDF and "url" when it is not. I am unclear why we need both. I can see a need for "url" but I don't know what benefits we get by also allowing "@id". Presumably it brings in the additional overhead that a client must be able to navigate both. I understood from the meeting summarised in #33 that we would only use "url". To quote Boyan and agreed to by Alistair: "it is simpler, neutral about the resource (data id vs api), supports the narrative, and allows (I think) the more compact data encoding favored by AR. On the downside, its less descriptive, but the enhanced description is likely needed for a small group of uses cases."
Oh bother. In your example, Dave, and in the source in the repo, the first object should have been removed when I cleaned everything up yesterday (update: I've updated the example). The first and second objects in the subjectOf array point to the same resource. It is a hangover from having the two options sitting side by side. Based on what I'm now calling the Range-33 decision (because it sounds cool), the example should look like this:
{
  "@context": [
    "https://opengeospatial.github.io/SELFIE/contexts/elf-index.jsonld",
    "https://opengeospatial.github.io/SELFIE/contexts/elf-data.jsonld"
  ],
  "@id": "https://lab.scinfo.org.nz/selfie/id/aquifersystem/richelieu",
  "@type": "gw:GW_AquiferSystem",
  "name": "Aquifer system : Richelieu",
  "subjectOf": [
    {
      "url": "https://lab.scinfo.org.nz/selfie/data/aquifersystem/richelieu",
      "@type": "foaf:Document",
      "primaryTopic": "gw:GW_AquiferSystem",
      "format": [
        "text/ttl",
        "application/ld+json"
      ]
    },
    {
      "url": "https://lab.scinfo.org.nz/service/selfie/api/collections/data/items/aquifersystem-richelieu",
      "@type": "foaf:Document",
      "primaryTopic": "gw:GW_AquiferSystem",
      "format": [
        "application/vnd.geo+json",
        "text/html"
      ]
    }
  ]
}
The other issue raised in #33 is: "WRT to using foaf:primaryTopic instead of dc:type." I don't think this was resolved there, but it was resolved in #41. To quote Alistair: "dct:type is being replaced by foaf:primaryTopic (see #33). This is used to strongly suggest the type of target resource described by the document located at the url - the target resource may not have an RDF representation - e.g. it could be a GML doc, or an HTML page". Simon Cox raised some issue with this, but did not specify what it was. If other SELFIE participants are happy to go along with Boyan's suggestion, then I suggest we agree to use it until shown the error of our ways.
I'm a bit reluctant to stir the pot here as I try to understand the respective points. I just now realise another aspect of url vs @id: the former does not necessarily point to an RDF document, the latter must. GSIP did not make that distinction since content negotiation was doing just that. I initially understood url/@id as follows: the former is a generic URL, most likely an API call, not to be considered an identifier at all, while the latter is a resource and an identifier for something that can be content-negotiated and resolved (so other datasets might use this URI to point to something). I saw the "url" pattern as a way to access stuff without the overhead of content negotiation. OK. I don't have a horse in this race.
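To make the distinction concrete, here is a rough sketch of the two access styles as I read them from this thread. The helper names are hypothetical, nothing is sent over the network (the requests are only constructed), and whether a given server actually honours the Accept header is an assumption.

```python
from urllib.request import Request


def request_for_id(uri, media_type):
    """@id style: one identifier, representation chosen via the Accept header."""
    return Request(uri, headers={"Accept": media_type})


def request_for_url(url):
    """url style: the link already names one concrete document, so no Accept header."""
    return Request(url)
```

With the "url" pattern the client reads the advertised "format" list and follows the link as-is; with "@id" it has to negotiate for the representation it wants.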
The bit I need explained is @type (sorry). My understanding of @type was that it identifies the type (in the schematic sense) of the document / snippet I'm looking at. Because if I pull an RDF version of GWML, the aquifer (encoded using the GWML schema) will also be @type = gw:AquiferSystem. Hence my confusion (and I don't imply we got it right in the current version of GSIP MIR).
help ?
caffeine kicking in..
to add to my message.. I would expect the MIR to have @type="selfie:mir"
...
I don't think we would ever have a pointer to the MIR (MR) directly so we wouldn't document its type in that way?
Not sure this helps, but @type is just short for http://www.w3.org/1999/02/22-rdf-syntax-ns#type and @id is short for subject.
Here is the example flattened out at https://json-ld.org/playground/
<https://lab.scinfo.org.nz/selfie/id/aquifersystem/richelieu> <http://schema.org/name> "Aquifer system : Richelieu" .
<https://lab.scinfo.org.nz/selfie/id/aquifersystem/richelieu> <http://schema.org/subjectOf> _:b0 .
<https://lab.scinfo.org.nz/selfie/id/aquifersystem/richelieu> <http://schema.org/subjectOf> _:b1 .
<https://lab.scinfo.org.nz/selfie/id/aquifersystem/richelieu> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <gw:GW_AquiferSystem> .
_:b0 <http://purl.org/dc/terms/format> "application/ld+json" .
_:b0 <http://purl.org/dc/terms/format> "text/ttl" .
_:b0 <http://schema.org/url> "https://lab.scinfo.org.nz/selfie/data/aquifersystem/richelieu" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document> .
_:b0 <http://xmlns.com/foaf/0.1/primaryTopic> "gw:GW_AquiferSystem" .
_:b1 <http://purl.org/dc/terms/format> "application/vnd.geo+json" .
_:b1 <http://purl.org/dc/terms/format> "text/html" .
_:b1 <http://schema.org/url> "https://lab.scinfo.org.nz/service/selfie/api/collections/data/items/aquifersystem-richelieu" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document> .
_:b1 <http://xmlns.com/foaf/0.1/primaryTopic> "gw:GW_AquiferSystem" .
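The rdf-syntax-ns#type lines in the flattened output follow directly from each node's @id and @type. A minimal sketch of that mapping, assuming plain dict input and doing no context expansion (so prefixed names like "gw:GW_AquiferSystem" are passed through untouched); the function name and the "_:blank" placeholder label are hypothetical:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def type_triples(node):
    """Return one (subject, predicate, object) tuple per @type on a JSON-LD node."""
    subject = node.get("@id", "_:blank")  # a node without @id is a blank node
    types = node.get("@type", [])
    if isinstance(types, str):  # @type may be a single string or a list
        types = [types]
    return [(subject, RDF_TYPE, t) for t in types]
```

So @type says nothing about the document being parsed; it only asserts the class of whatever @id (or the blank node) identifies.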
And this is where I get confused. For me @type is also information for the system ingesting the document (the MIR), so it can figure out the type of the node it is currently parsing (amongst other things).
Yeah, no. Since @id is just a subject, what @type is saying is:
@id : http://www.w3.org/1999/02/22-rdf-syntax-ns#type : @type
Sorry - you lost me
Another way of putting (or misrepresenting) what @dblodgett-usgs said is perhaps: we shouldn't have formally defined ELFIE resource types (elf:MetaResource; elf:DataResource). Personally, in the context of SELFIE, I think we are discovering we are over engineering things.
"Personally, in the context of SELFIE, I think we are discovering we are over engineering things."
Good point. Maybe we should move on.
ELFIE started with a model of views on representations of environmental things (inspired by the UK Linked Data API, specifically the part about viewing resources). I say "inspired by" because we made no effort to reconcile what we did with an actual implementation using the LDA. Revisiting it, perhaps we should compare our thing with the LDA. It could give us a mechanism to implement (and define) what we want without overloading the @identifying and @typing of things (handling domain models and API resource models).
MIR: https://lab.scinfo.org.nz/selfie/id/hydrogeounit/richelieu-1?_view=meta
DIR: https://lab.scinfo.org.nz/selfie/id/hydrogeounit/richelieu-1?_view=data
Where meta is the default view.
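As a rough illustration of the view idea, a client could derive the MIR and DIR addresses from a single resource identifier. This is only a sketch: the view_url helper is hypothetical, and the only thing taken from the proposal is the _view query parameter with "meta" as the default.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def view_url(resource_id, view="meta"):
    """Return resource_id with its _view query parameter set to the given view."""
    parts = urlsplit(resource_id)
    # Drop any existing _view so the parameter is never duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "_view"]
    query.append(("_view", view))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, view_url("https://lab.scinfo.org.nz/selfie/id/hydrogeounit/richelieu-1", "data") gives the DIR address, with one @id shared by both views.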
Note, however that ELFIEs should be striving for a low barrier of entry. So, again, we should be wary of over complicating things.
reformatted ttl from Dave's flattened version
<https://lab.scinfo.org.nz/selfie/id/aquifersystem/richelieu> <http://schema.org/name> "Aquifer system : Richelieu" ;
    <http://schema.org/subjectOf>
    [
        <http://purl.org/dc/terms/format> "application/ld+json" , "text/ttl" ;
        <http://schema.org/url> "https://lab.scinfo.org.nz/selfie/data/aquifersystem/richelieu" ;
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document> ;
        <http://xmlns.com/foaf/0.1/primaryTopic> "gw:GW_AquiferSystem"
    ] ;
    <http://schema.org/subjectOf>
    [
        <http://purl.org/dc/terms/format> "application/vnd.geo+json" , "text/html" ;
        <http://schema.org/url> "https://lab.scinfo.org.nz/service/selfie/api/collections/data/items/aquifersystem-richelieu" ;
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document> ;
        <http://xmlns.com/foaf/0.1/primaryTopic> "gw:GW_AquiferSystem"
    ] ;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <gw:GW_AquiferSystem> .
Would be helpful to have one or more people read and summarize the tradeoffs of the issues brought up here in reference to the specifications of the properties in question. We seem to be at an impasse and we need to clarify the argument with data.
For consideration of the group:
After further conversation, the issues being discussed here are beyond the scope of what we can solve in this experiment. We have space in the ER outline to discuss them and suggest future work. Existing SELFIE experimental contexts and implementations stand and will be documented in the ER.
Consolidated contexts will be brought to an ELFIE outcomes context.
This will allow us to move forward with completion of the IE while recording the outcome of these conversations.
We need to recognize that there is a semantic web / rich linked data set of use cases that are outside the scope of the IE to be left for future work.
Decision has largely been made. We need clear examples that implement the outcome correctly. This issue should contain annotated syntactically correct examples that illustrate a resource model.
Here's the tier of the model described.
some discussion of its nuances