w3c / json-ld-syntax

JSON-LD 1.1 Specification
https://w3c.github.io/json-ld-syntax/

how to "retype" rdf:JSON to geo:geoJSONLiteral? #425

Open VladimirAlexiev opened 10 months ago

VladimirAlexiev commented 10 months ago

https://w3c.github.io/json-ld-syntax/#json-literals shows how a piece of JSON can be captured as a datatyped literal "..."^^rdf:JSON

@situx @nicholascar @mathib @dr-shorthair https://github.com/opengeospatial/ogc-geosparql/issues/1 discusses the new GeoSPARQL 1.1 geo:geoJSONLiteral. According to the standard, such geometry serialization must both be connected by a specific property, and carry a specific datatype:

<geometry> geo:asGeoJSONLiteral "..."^^geo:geoJSONLiteral

@gkellogg Is it possible to somehow "retype" an rdf:JSON literal to the specific datatype geo:geoJSONLiteral? It's easy enough to do this after the fact (e.g. with a SPARQL update), but that seems unsatisfactory...

Perhaps something like this can be added:

"@context": {
    "@version": 1.1,
    "geo": "http://www.opengis.net/ont/geosparql#",
    "myGeom": {"@id": "geo:asGeoJSONLiteral", "@type": "@json", "@json": "geo:geoJSONLiteral"}
}

Notice the last key: this is similar to "@nest": "labels" in https://w3c.github.io/json-ld-syntax/#example-defining-property-nesting
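
For comparison, the after-the-fact fallback mentioned in this comment (retyping once the RDF has been produced) might look like the following minimal Python sketch. It works on N-Quads text rather than through a SPARQL update. The function name `retype_json_literals` is hypothetical, and the code assumes one statement per line with no comments, so it is not a general N-Quads parser.

```python
RDF_JSON = "http://www.w3.org/1999/02/22-rdf-syntax-ns#JSON"
GEO = "http://www.opengis.net/ont/geosparql#"


def retype_json_literals(nquads: str, predicate: str, datatype: str) -> str:
    """Swap rdf:JSON for `datatype` on statements that use `predicate`.

    Hypothetical helper: a line-based sketch, not a real N-Quads parser.
    """
    old = f'"^^<{RDF_JSON}>'
    new = f'"^^<{datatype}>'
    out = []
    for line in nquads.splitlines():
        # Subject and predicate never contain whitespace in N-Quads,
        # so the first two tokens can be split off safely.
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1] == f"<{predicate}>":
            # The datatype follows the literal, so the last match is the real one.
            idx = parts[2].rfind(old)
            if idx != -1:
                rest = parts[2][:idx] + new + parts[2][idx + len(old):]
                line = f"{parts[0]} {parts[1]} {rest}"
        out.append(line)
    return "\n".join(out)


example = (
    f'_:b0 <{GEO}asGeoJSON> "[1,2,3]"^^<{RDF_JSON}> .\n'
    f'_:b0 <http://example.org/other> "[4,5]"^^<{RDF_JSON}> .'
)
print(retype_json_literals(example, GEO + "asGeoJSON", GEO + "geoJSONLiteral"))
```

Restricting the rewrite to one predicate keeps unrelated rdf:JSON literals untouched, which is the concern with retyping everything in a dataset.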

TallTed commented 8 months ago

It seems dangerous to retype all encountered rdf:JSON to geo:geoJSONLiteral, except within some specific dataset or data interaction.

I'm struggling to think of why you might need this to have happened, at any time other than during a query, at which point SPARQL/GeoSPARQL always lets you coerce/cast the datatype you want (assuming it's a "natural" transformation, which would be the case if geo:geoJSONLiteral is a subtype of rdf:JSON, as it appears)...

Would that not be sufficient?

gkellogg commented 8 months ago

This issue was discussed in a meeting.

Issue w3c/json-ld-syntax#425
https://github.com/w3c/json-ld-syntax/issues/425 -> Issue 425 how to "retype" rdf:JSON to geo:geoJSONLiteral? (by VladimirAlexiev)
Niklas Lindström: No strong feelings about this, but my reaction is that I've considered something like this, and it could be important to do, but not in JSON-LD.
... Whatever we might come up with could have some repercussions on how to deal with literals in RDF.
Gregg Kellogg: I think this is narrowly defined on specifying the datatype of a JSON literal.
Niklas Lindström: If properties are intrinsic in the data-space of the datatype, you could think of what it entails.
... It reminds me of the direction thing for language literals.
... My hope would be that it could serve a general purpose for RDF usage.
David I. Lehn: When I looked at this, I saw it in the RDF domain. Could it be done at the application level?
... Is it a general solution?
Pierre-Antoine Champin: What you said about the asymmetry is interesting. There are corner cases where JSON-LD fails to do proper coercion.
Pierre-Antoine Champin: "P": 3.14
... For example JSON numbers.
Pierre-Antoine Champin: "314E-2"^^xsd:decimal
... Things are not as smooth as you describe.
... The only robust way to get a literal of any type is to use strings. You get into trouble using numbers, and you could using JSON values, as well.
... Not sure it's a bug; I'd lean towards application level.
Benjamin Young: I think this is severe scope creep; if we add geoJSONliteral, it gets nuts.
... We could get data peppered with all sorts of different kinds of literals. Ultimately, if it's a string, you can parse it into JSON.
... Better stated using properties within the resulting graph, than by expecting that the content of the literal has some additional meaning.
Gregg Kellogg: TAG thinks polyglot formats are an anti-pattern.
Benjamin Young: This could lead to a proliferation of datatypes that require special knowledge to understand.
Gregg Kellogg: Propose closing.
David I. Lehn: This may be beyond the scope of the group, but it feels like structured suffixes of media-types.
... Not sure how to represent that in RDF.
Pierre-Antoine Champin: geo:geoJSONLiteral rdfs:subClassOf rdf:JSON
Ted Thibodeau Jr.: It would be a sub-property.
Gregg Kellogg: You would need to parse a string value and parse to JSON to work with it in the application layer.
Niklas Lindström: I agree with the use cases, and it is pragmatic within the JSON-LD context, but it opens up several strange things.
... The real solution is to use RDF to describe things with properties.
Ted Thibodeau Jr.: Or probably better ... "blah"^^https://www.iana.org/assignments/media-types/application/geo+json
... In practice we use structured values. Theory and practice often collide.
... Not convinced to close it, but can see why we might want to defer to the application layer.
Benjamin Young: It's a heavy topic that we should continue to discuss. I agree it should be in the application layer.
... The reason we did this is because of query considerations, and a query engine might be expected to introspect these values.
... From GeoSPARQL, you want to be able to find GeoJSON literals.
... This requires that the application do more of the database's work.
... This requires vocabularies to describe the relationship between GeoJSON and RDF.
... What's being asked for is a way of doing magic-string labeling, which is what media-types are.
... A better solution would be to figure out how to bring in media-types and then we could leverage this.
... But, I don't just want to add new terms to do a surface solution that doesn't really solve the problem.

VladimirAlexiev commented 8 months ago

To Benjamin Young (Github finds 23 users of that name so I don't know whom to ping):

To @gkellogg

To @TallTed:

Notes:

In summary: I don't think it's appropriate for JSON-LD to question other standards groups' decisions on what to keep as literals and what to break down into triples. JSON-LD already has key @json to emit rdf:JSON literals: I'm only asking for a way to attach a more specific datatype.

Cheers!

TallTed commented 8 months ago

@VladimirAlexiev

To Benjamin Young (Github finds 23 users of that name so I don't know whom to ping):

I believe you're looking for @BigBlueHat

To @TallTed:

  • Or probably better "blah"^^https://www.iana.org/assignments/media-types/application/geo+json
    • It might have been better, but GeoSPARQL 1.1 has specified geo:geoJSONLiteral (and geo:wktLiteral 12 years earlier)
    • Your own GeoSPARQL Benchmark uses wktLiteral, gmlLiteral, so why the sudden resistance to geoJSONLiteral?

That GeoSPARQL Benchmark belongs to OpenLink Software, my employer; it is not "[my] own".

"Sudden resistance"? Conversation sparks thoughts, which get "voiced" in IRC. I think you're reading more into those thoughts/comments than was meant.

Also, it appears that you may have misread my "probably better" to have been saying that https://www.iana.org/assignments/media-types/application/geo+json would probably be better than geo:geoJSONLiteral, when I was saying that https://www.iana.org/assignments/media-types/application/geo+json would probably be better than my previously offered https://www.w3.org/ns/iana/media-types/application/geo+json.

Still, I see no problem with raising the possibility of https://www.iana.org/assignments/media-types/application/geo+json (or https://www.w3.org/ns/iana/media-types/application/geo+json) as a datatype, which might be defined to be equivalent to geo:geoJSONLiteral.

Interestingly, RFC7946, The GeoJSON Format, which has no "obsoleted by", does not contain the string geoJSONLiteral, so it's not clear to me that "GeoSPARQL 1.1 has specified geo:geoJSONLiteral". There are references to external documents, but these lack dereferenceable URIs, not even pointing at documents behind paywalls, so it's difficult if not impossible for new readers to learn more.

In summary: I don't think it's appropriate for JSON-LD to question other standards groups' decisions on what to keep as literals and what to break down into triples.

I don't think it's appropriate for anyone to tell me (nor the JSON-LD WG) that I (we) cannot break some literal(s) into triples. Whether that anyone chooses to consume or otherwise make use of the triples I (we) have derived from those literals is up to them.

JSON-LD already has key @json to emit rdf:JSON literals: I'm only asking for a way to attach a more specific datatype.

I think this has already been answered. In general, "key @json [should be left] to emit rdf:JSON literals". In your specific document(s), where you know that all now generic rdf:JSON literals are really geo:geoJSONLiteral literals, you can of course have your own @context which maps @json to emit geo:geoJSONLiteral literals. But this should not be a general/global/universal @context declaration! That's part of the point of @context — its content is context specific.

VladimirAlexiev commented 8 months ago

@TallTed GeoJSON is not obsoleted by GeoSPARQL: it is incorporated into it. See https://docs.ogc.org/is/22-047r1/22-047r1.html#10-8-3-1-%C2%A0-rdfs-datatype-geo-geojsonliteral (By the same token, GML is not obsoleted by GeoSPARQL: it is incorporated as another geometry serialization format.)

I don't think it's appropriate for anyone to tell me (nor the JSON-LD WG) that I (we) cannot break some literal(s) into triples.

This approach will break established implementation practices (semantic repositories that follow the GeoSPARQL standard to handle geometries). What would be the benefit of doing that?

you can of course have your own @context which maps @json to emit geo:geoJSONLiteral literals.

How? That's exactly the possibility I'm asking for.

TallTed commented 8 months ago

Anyone can mint their own @context document to reference at the top of their own JSON-LD documents. As I understand your described wish, which is simply type coercion, I believe you could put this at the start of your GeoJSON documents —

"@context": {
    "geo": "http://www.opengis.net/ont/geosparql#",
    "myGeom": {"@id": "geo:asGeoJSONLiteral", "@type": "geo:geoJSONLiteral"}
}

— or perhaps better —

"@context": {
    "myGeom": {"@id": "http://www.opengis.net/ont/geosparql#asGeoJSONLiteral", 
               "@type": "http://www.opengis.net/ont/geosparql#geoJSONLiteral"}
}

Note that "@version": 1.1, as was seen in your initial example, is not needed.

Although... I can find nothing that discusses the geo:asGeoJSONLiteral property you say is required to be used. http://www.opengis.net/ont/geosparql#asGeoJSONLiteral does not dereference; there is no asGeoJSONLiteral fragment in the documents returned from http://www.opengis.net/ont/geosparql.

You may need to remove or edit any existing @context declaration to remove some existing mapping(s), particularly if those existing @context include any @protected settings.

VladimirAlexiev commented 8 months ago

Hi @TallTed!

Please read https://w3c.github.io/json-ld-syntax/#json-literals. You have to state "@type": "@json" if you want a fragment of JSON to be captured into a literal.

Let's try at the playground: https://tinyurl.com/287gzb57

{
  "@context": {
    "@version": 1.1,
    "geo": "http://www.opengis.net/ont/geosparql#",
    "asGeoJSON": {"@id": "geo:asGeoJSON", "@type": "@json"}
  },
  "asGeoJSON": [1,2,3]
}

returns what I want, except for the generic datatype:

_:b0 <http://www.opengis.net/ont/geosparql#asGeoJSON> "[1,2,3]"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#JSON> .

If I follow your suggestion:

{
  "@context": {
    "@version": 1.1,
    "geo": "http://www.opengis.net/ont/geosparql#",
    "asGeoJSON": {"@id": "geo:asGeoJSON", "@type": "geo:geoJSONLiteral"}
  },
  "asGeoJSON": [1,2,3]
}

then the geometry is broken into individual triples, something I don't want:

_:b0 <http://www.opengis.net/ont/geosparql#asGeoJSON> "1"^^<http://www.opengis.net/ont/geosparql#geoJSONLiteral> .
_:b0 <http://www.opengis.net/ont/geosparql#asGeoJSON> "2"^^<http://www.opengis.net/ont/geosparql#geoJSONLiteral> .
_:b0 <http://www.opengis.net/ont/geosparql#asGeoJSON> "3"^^<http://www.opengis.net/ont/geosparql#geoJSONLiteral> .

If the input JSON has a string that looks like a JSON fragment, then sure I can coerce it to a specific datatype:

{
  "@context": {
    "@version": 1.1,
    "geo": "http://www.opengis.net/ont/geosparql#",
    "asGeoJSON": {"@id": "geo:asGeoJSON", "@type": "geo:geoJSONLiteral"}
  },
  "asGeoJSON": "[1,2,3]"
}

But the input files that I work with (JSON-FG, CityJSON etc) are all JSON, they don't use strings looking like JSON.
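
One application-level workaround for such inputs is to turn the raw JSON value into a string before handing the document to a JSON-LD processor, so that the ordinary `"@type": "geo:geoJSONLiteral"` coercion applies. The following is a minimal Python sketch under stated assumptions: the helper name `stringify_term_values` is hypothetical, only the standard `json` module is used, and the compact `json.dumps` output is not guaranteed to match the canonical form a JSON-LD 1.1 processor produces for rdf:JSON literals.

```python
import json


def stringify_term_values(node, term):
    """Return a copy of `node` where each non-string value of `term`
    is replaced by its compact JSON text.

    Hypothetical pre-processing step, run before JSON-LD processing.
    """
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "@context":
                # Leave the context alone: it defines the term, it is not data.
                result[key] = value
            elif key == term and not isinstance(value, str):
                result[key] = json.dumps(value, separators=(",", ":"), sort_keys=True)
            else:
                result[key] = stringify_term_values(value, term)
        return result
    if isinstance(node, list):
        return [stringify_term_values(item, term) for item in node]
    return node


doc = {
    "@context": {
        "@version": 1.1,
        "geo": "http://www.opengis.net/ont/geosparql#",
        "asGeoJSON": {"@id": "geo:asGeoJSON", "@type": "geo:geoJSONLiteral"},
    },
    "asGeoJSON": [1, 2, 3],
}
print(stringify_term_values(doc, "asGeoJSON")["asGeoJSON"])  # [1,2,3]
```

This reproduces the string-valued input shown above, at the cost of an extra step outside JSON-LD, which is exactly what the proposal is trying to avoid.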

Now read my original proposal: I want a new key @json (mimicking the existing value @json) so I can retype to a specific datatype:

"asGeoJSON": {"@id": "geo:asGeoJSON", "@type": "@json", "@json": "geo:geoJSONLiteral"}

Is it more clear now?


There is no asGeoJSONLiteral fragment in the documents returned from http://www.opengis.net/ont/geosparql.

Sorry, my mistake, here's the correct prop pointing to the correct datatype:

:asGeoJSON
    a rdf:Property, owl:DatatypeProperty ;
    rdfs:subPropertyOf :hasSerialization;
    rdfs:domain :Geometry ;
    rdfs:range :geoJSONLiteral .

BigBlueHat commented 6 months ago

"asGeoJSON": {"@id": "geo:asGeoJSON", "@type": "@json", "@json": "geo:geoJSONLiteral"}

This gets close to what's now in my head after reading all the defense/feedback, though with one key change (for a future version of JSON-LD).

{
  "@context": {
    "asGeoJSON": {"@id": "geo:asGeoJSON", "@type": "geo:geoJSONLiteral", "@json": "object"}
  },
  "asGeoJSON": { /* raw geojson */ }
}

resulting in:

_:b0 <geo:asGeoJSON> "{ /* raw geojson */ }"^^<geo:geoJSONLiteral>

That proposed format could potentially allow us to more clearly unconflate RDF datatypes and JSON value/format types. The proposed @json key could be used to let the processor know to encode the JSON object as a value for RDF processors and @type could then be focused solely on the RDF datatype.

Alternatively, the JSON object storage (vs. processing as more JSON-LD) could be done automatically when objects are detected--though this could result in unexpected processing output if/when errors occurred in the JSON...and undoubtedly makes "is this bit JSON-LD or just JSON?" a harder question for authors to keep straight.

But...could result in this "just working":

{
  "@context": {
    "asGeoJSON": {"@id": "geo:asGeoJSON", "@type": "geo:geoJSONLiteral"}
  },
  "asGeoJSON": { /* raw geojson */ }
}

I'm certain there are issues with that reasoning that @gkellogg and others can elaborate on, but it may be interesting to explore if/when we could make datatypes more flexible...at least around JSON objects as values.

VladimirAlexiev commented 6 months ago

@BigBlueHat Thanks for supporting this! I like better "@type": "geo:geoJSONLiteral". But for the value of @json:

PS: when you take into account the prefix "geo": "http://www.opengis.net/ont/geosparql#", this results in prefixed URLs: geo:asGeoJSON "{ /* raw geojson */ }"^^geo:geoJSONLiteral.

The difference is crucial: there is in fact a geo URI scheme, and <geo:12.3,45.6> is a valid geo point, whereas <geo:asGeoJSON> is not a valid URI.
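
To illustrate the difference: a compact IRI such as geo:asGeoJSON is only expanded when its prefix is defined in the active context; otherwise the string is read as an absolute IRI whose scheme is geo. The Python sketch below is a deliberate simplification, with a hypothetical function name, and is far from the full IRI expansion algorithm in the JSON-LD API.

```python
def expand_compact_iri(value: str, prefixes: dict) -> str:
    """Expand `prefix:suffix` using `prefixes`; return `value` unchanged otherwise.

    Hypothetical, simplified stand-in for JSON-LD IRI expansion.
    """
    prefix, sep, suffix = value.partition(":")
    # "scheme://..." is treated as an absolute IRI and never as a compact IRI.
    if sep and not suffix.startswith("//") and prefix in prefixes:
        return prefixes[prefix] + suffix
    return value


context_prefixes = {"geo": "http://www.opengis.net/ont/geosparql#"}

# With the prefix defined, the term expands into the GeoSPARQL namespace.
print(expand_compact_iri("geo:asGeoJSON", context_prefixes))

# Without it, the same string stays as-is and reads as a geo: scheme URI.
print(expand_compact_iri("geo:asGeoJSON", {}))
```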

gkellogg commented 5 months ago

This relates to work on CDT literals proposed by AWS for SPARQL. Being able to use cdt:Map or cdt:List as the datatype of such a literal would directly play into this mechanism. See https://github.com/awslabs/SPARQL-CDTs/issues/3#issuecomment-2143547910.

VladimirAlexiev commented 5 months ago

If the need for such "retyping" has been established, can we consider concrete syntaxes to express it? Here are some choices:

1. "@type": "@json", "@json": "geo:geoJSONLiteral"
2. "@type": "geo:geoJSONLiteral", "@json": true
3. "@type": "geo:geoJSONLiteral", "@json": "@json"
4. "@type": ["@json", "geo:geoJSONLiteral"]

@gkellogg what do you think?

gkellogg commented 5 months ago

We probably need to think about other use cases that could use the same pattern. Really, the issue is when @type is used with a keyword and the resulting datatype IRI is implicit. Is there some corollary for YAML or CBOR that is not handled by @json?

Another possibility would be to add @json as a possible value for @container, which would treat the value as JSON but allow another datatype to be specified using @type. For example:

{
  "@context": {
    "asGeoJSON": {"@id": "geo:asGeoJSON", "@container": "@json", "@type": "geo:geoJSONLiteral"}
  },
  "asGeoJSON": { /* raw geojson */ }
}

TallTed commented 5 months ago

It may help others to know that the geo: URL scheme is specified by RFC 5870.

dr-shorthair commented 5 months ago

@TallTed have you seen geo: used in practice? I note that the registry [1] has not been touched since the RFC was issued.

[1] https://www.iana.org/assignments/geo-uri-parameters/geo-uri-parameters.xhtml#geo-uri-parameters-1

TallTed commented 5 months ago

@dr-shorthair — I was following up on https://github.com/w3c/json-ld-syntax/issues/425#issuecomment-2138882482 (which I should have cited). I have not personally seen RFC5870's geo: in use in the wild, but that means nothing; there are thousands if not millions of RDF users out there, and I don't see all their data.

VladimirAlexiev commented 1 month ago

@gkellogg I like "@container": "@json". As for extending into other "BLOB/CLOB" types (YAML, CBOR etc), I think this is a far future extension...
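
For readers trying to picture the `"@container": "@json"` idea: it is only a proposal in this thread, not part of JSON-LD 1.1. The Python sketch below shows the kind of N-Quads literal such a term definition would presumably yield, keeping the JSON value whole while taking the datatype from `@type`. The helper `to_typed_json_literal` is hypothetical, and its `json.dumps` serialization is an assumption, not the canonicalization a real processor would be required to use.

```python
import json

GEO_JSON_LITERAL = "http://www.opengis.net/ont/geosparql#geoJSONLiteral"


def to_typed_json_literal(value, datatype_iri: str) -> str:
    """Serialize a JSON value as one N-Quads literal with the given datatype.

    Hypothetical helper illustrating the proposed behaviour only.
    """
    text = json.dumps(value, separators=(",", ":"), sort_keys=True, ensure_ascii=False)
    # Escape backslashes first, then double quotes, as N-Quads string literals require.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{datatype_iri}>'


# The array stays a single literal instead of being split into three triples.
print(to_typed_json_literal([1, 2, 3], GEO_JSON_LITERAL))
print(to_typed_json_literal({"type": "Point", "coordinates": [1, 2]}, GEO_JSON_LITERAL))
```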