denevers closed 4 years ago
In what sense does it not use a context, when there's a @context property?
Oh, there it is, at the very end. Sorry, my bad. So is this OK?
@denevers Can you please clarify what you mean by "conformant"? The content or the form? Is this JSON-LD content supposed to concern one main resource?
From what I see, it's a graph containing a lot of mixed things, from OWL class definitions to resource descriptions. Only two resources have related DIRs.
Can you please provide some details :)
@abhritchie created a series of contexts that we should use to create MIR instances. I did not use (or look at) any of them because I hope the Jena RDF library can create them directly from the triple store.
I notice Jena wraps the document in a graph, for instance. Does it matter?
OK, I see. I think the output needs some refining using the JSON-LD library. It provides a way to compact/format a JSON-LD payload by binding it to some contexts. I tried it with JS and it works fine. We should look into it with Java.
Regarding the graph, I am afraid it does matter. I remember discussing that in ELFIE. By putting everything in a JSON-LD graph we lose the ELFIE pattern, i.e. a first level describing the main entity plus nested levels describing related entities. In your example I see a lot of entities put on the same first level, since it's a graph. This is why I asked which NIR this JSON-LD is about.
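To make the difference between a flat graph and the ELFIE pattern concrete, here is a minimal sketch that re-roots a flat `@graph` at one chosen main entity and inlines the nodes it references. It is a simplified illustration, not the JSON-LD framing algorithm and not what Jena or the json-ld library does; the function name `nest_from_root` and all `ex:`/`hy:` identifiers are hypothetical.

```python
def nest_from_root(doc, root_id):
    """Return the node `root_id` with referenced graph nodes nested inside it.

    Assumes `doc` is a parsed JSON-LD document with a flat "@graph" list and
    that `root_id` is the "@id" of one of its nodes (KeyError otherwise).
    """
    nodes = {n["@id"]: n for n in doc.get("@graph", []) if "@id" in n}

    def expand(value, seen):
        if isinstance(value, list):
            return [expand(v, seen) for v in value]
        # A bare reference is a dict whose only key is "@id".
        if isinstance(value, dict) and set(value) == {"@id"}:
            target = value["@id"]
            if target in nodes and target not in seen:
                return build(target, seen | {target})
        return value  # literal, unknown reference, or cycle: leave as-is

    def build(node_id, seen):
        return {
            key: (val if key.startswith("@") else expand(val, seen))
            for key, val in nodes[node_id].items()
        }

    result = build(root_id, {root_id})
    if "@context" in doc:
        result = {"@context": doc["@context"], **result}
    return result
```

References that would loop back to a node already on the current path are left as plain `{"@id": ...}` links, so cyclic graphs still terminate.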
In addition, the output contains the OWL definitions of some classes, such as owl:Thing and HY_Catchment. I think in (S)ELFIE we decided to include only assertions about the data, and not about the ontologies. I also notice there are a lot of owl:sameAs relations that relate objects to themselves (I don't understand that).
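As a rough illustration of that clean-up, the sketch below drops ontology definition nodes and reflexive owl:sameAs links from a list of JSON-LD nodes. It assumes the nodes use compact `owl:` prefixes and that ontology definitions can be recognised by their `@type`; the set `OWL_TYPES` and the function name are assumptions made for the example, not part of any SELFIE tooling.

```python
# Node types treated as ontology definitions (an assumed, non-exhaustive list).
OWL_TYPES = {"owl:Class", "owl:ObjectProperty", "owl:DatatypeProperty", "owl:Ontology"}


def as_list(value):
    """JSON-LD allows a single value or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def strip_ontology_noise(graph_nodes):
    """Return copies of the data nodes, without OWL definitions or self-sameAs links."""
    cleaned = []
    for node in graph_nodes:
        if set(as_list(node.get("@type", []))) & OWL_TYPES:
            continue  # an assertion about the ontology, not about the data
        node = dict(node)  # shallow copy so the input is left untouched
        if "owl:sameAs" in node:
            kept = [
                ref for ref in as_list(node["owl:sameAs"])
                if not (isinstance(ref, dict) and ref.get("@id") == node.get("@id"))
            ]
            if kept:
                node["owl:sameAs"] = kept
            else:
                del node["owl:sameAs"]
        cleaned.append(node)
    return cleaned
```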
After discussion with @afeliachi and @sgrellet, I understand that the problem is that we can't identify the resource this MIR is about. Because of the tree structure of JSON-LD, we can infer it's the resource at the root. We can't do this in TTL (and can't guarantee it in other encodings either, such as N3 or even RDF/XML, although it would be possible to force it in RDF/XML). So this solution only works for JSON-LD and RDF/XML.
In a typical move on my part to avoid unnecessary work: would it be a more robust solution to infer that the context resource (the one the MIR is about) is the resource that has a schema:subjectOf?
This works for all encodings, and would also work if the MIR is about more than one resource (and it gives an acid test for whether the MIR is legit: no subjectOf, not a MIR).
The extra owl bit, I can remove easily
> I notice also there are a lot of owl:sameAs relations that relate objects to themselves (I don't understand that).
I'll check, but I think Jena's inference engine just concluded that resources are the same as themselves. Jena thinks we are dumb.
This works fine (I'm not a python programmer, don't judge)
import rdflib

# Messages printed when no context resource is found.
EN_US = "This is not a MIR"
EN_CA = EN_US + ", eh"
FR_FR = "Ceci n'est pas un MIR -- Magritte"  # ok.. he's Belgian. I know
FR_CA = "C'est pas un MIR, ta patente"

SCHEMAS_ORG = rdflib.Namespace("http://schema.org/")
TEST_FEATURE = "https://raw.githubusercontent.com/opengeospatial/SELFIE/master/docs/examples/meta-resource-2.geojson"


def getContextResource(g):
    ''' we assume there is at most one '''
    # The context resource is the subject of a schema:subjectOf triple.
    for subject, obj in g.subject_objects(SCHEMAS_ORG["subjectOf"]):
        return subject  # return the first
    return None


g = rdflib.Graph()
# format='json-ld' is built into rdflib 6+; older versions need the rdflib-jsonld plugin.
g.parse(TEST_FEATURE, format='json-ld')
context = getContextResource(g)
if context is None:
    print(EN_US)
    print(EN_CA)
    print(FR_FR)
    print(FR_CA)
else:
    print(context)
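The snippet above stops at the first match. For the case where a MIR is about more than one resource, a variant that collects every resource carrying a subjectOf might look like the sketch below. It works on a parsed JSON-LD document (plain dicts and lists) rather than an rdflib graph, and it assumes the property appears under one of the key spellings in `SUBJECT_OF_KEYS`; both that tuple and the function name are illustrative assumptions.

```python
# Key spellings under which schema:subjectOf may appear, depending on the context in use.
SUBJECT_OF_KEYS = (
    "http://schema.org/subjectOf",
    "https://schema.org/subjectOf",
    "schema:subjectOf",
    "subjectOf",
)


def context_resources(doc):
    """Return the @id of every node that has a subjectOf property, in document order."""
    found = []

    def walk(value):
        if isinstance(value, list):
            for item in value:
                walk(item)
        elif isinstance(value, dict):
            if "@id" in value and any(key in value for key in SUBJECT_OF_KEYS):
                if value["@id"] not in found:
                    found.append(value["@id"])
            for key, child in value.items():
                if key != "@context":  # term definitions are not data nodes
                    walk(child)

    walk(doc)
    return found
```

An empty result corresponds to the "no subjectOf, not a MIR" test.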
Wait.. this does not work if there are no representations. Can a resource have no representation?
Theoretically yes. We can identify a resource without creating any statements about it - say, minting a URI for a site that is to be visited. Arguably unwise and to be discouraged but not at all wrong.
Check out the ELFIE Context README, @denevers. It has a few guidelines for the structure/layout of (S)ELFIE JSON-LD documents. These will grow as we do more work.
To be honest, this is precisely the bit I wanted to avoid, hoping Jena would do the work for me.
ok.. there's hope : https://jena.apache.org/documentation/io/rdf-output.html#json-ld
@denevers Has this issue played itself out? Should we go ahead and close, or represent anything from here in the ER?
Given no response and other threads we've opened up, I'm going to close this.
I've re-opened this issue due to a moment of (what I think to be) clarity on this.
I've been uncomfortable with the title of this issue since it was first opened. "Conformant" is the wrong way to think about what we are doing in SELFIE. We have some lightly modified ELFIE JSON-LD contexts which provide guidance. From the ELFIE ER (¶2 here emphasis added):
...the JSON-LD context was chosen as the most appropriate technical solution to express the graph views. The ELFIE contexts are meant to be precise descriptions of what the IE found to be the most appropriate linked data properties (attributes and relations) and types. They are not intended to be used as a schema to specify or validate the contents of a given document.
Given this, let's consider the dichotomy of representations that's been raised in #76. That is, (1) linked data confined to the <script> header of one or more html representations and (2) linked data representations that are ostensibly part of a LoD graph.
This issue illustrates the difficulty of getting (2) to behave in a way that would be "nice" for (1). But if we treat these as two different use cases with different requirements that relate to each other then we are freed up to let the graph be the graph (and not behave) and hack the pretty JSON-LD together however is most appropriate.
From an implementation point of view, I think this could actually be pretty slick. If we have a (2) flavor representation that comes out of a triple store, and we have templating to html, we can add (1) content to a <script> tag with templating from the full linked data graph the same as we would the rest of the html.
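A minimal sketch of that templating step is shown below, assuming the (1) content has already been selected from the full graph and is available as a Python dict. The function name `jsonld_script_tag` is hypothetical; `application/ld+json` is the media type used for JSON-LD embedded in HTML.

```python
import json


def jsonld_script_tag(doc):
    """Serialise a JSON-LD document into an HTML <script> element."""
    payload = json.dumps(doc, indent=2, ensure_ascii=False)
    # Escape "</" so a literal "</script>" inside a string value cannot close
    # the element early; "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"
```

The returned string can be dropped into whatever HTML template renders the rest of the page.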
Using GSIP pages as an example,
https://geoconnex.ca/gsip/ has decided to expose through the "info" route of its API.
https://geoconnex.ca/gsip/ thought would be useful to humans and link-hungry crawlers.
Now the question I have is -- is there any reason not to include the full graph in the <script> tag other than it getting pretty heavy for an HTML page load?
One perspective could be if you didn't want or didn't have schema.org content in your knowledge graph, you could template schema.org content into the html page?
Is there an argument that the content of that script tag should be the same as what you get back from a json-ld or ttl representation of the same resource?
The original intent of this issue was more about the actual formatting of the JSON-LD than its content. I can generate JSON-LD from various libraries (I use Jena) and the output should be "conformant" to JSON-LD. But, following some discussion with @abhritchie and @afeliachi, it seems there are extra formatting rules that "hint" to a client which resource is the context resource (the resource we are talking about). The context resource needs to be the "root" node in the JSON nesting. In fact, the same rule would need RDF/XML and TTL equivalents (the XML encoding has nesting, but TTL is flat). So it seems we are really talking about a SELFIE (ELFIE) profile of RDF/JSON-LD.
I understand that was the original intent, but I'm trying to get at a broader point that it brings up -- that formatting of the LD really only matters in the html <script> tag. And even then, do we really know that it does?
Good, I just wanted to be sure we were not talking at cross purposes.
> And even then, do we really know that it does?
When you get it during dereference, no, because you already know what the context resource is: it's the one you just dereferenced. But maybe there are other use cases.
It does not use contexts. This has been generated automatically by Jena