Closed cmdoret closed 1 year ago
Hi @cmdoret thank you so much for opening a PR.
The way calamus currently handles it is clearly wrong. However, I'm not sure if the proposed solution is correct, either.
With the xsd datatype and without @id, it's not an IRI reference (reference to a node) anymore; it becomes a string property with type coercion.
I.e. in ttl
<https://datascience.ch> <http://schema.org/logo> "https://datascience.ch/wp-content/uploads/2019/04/logo_SDSC-300x82.png"^^<http://www.w3.org/2001/XMLSchema#anyURI> .
vs.
<https://datascience.ch> <http://schema.org/logo> <https://datascience.ch/wp-content/uploads/2019/04/logo_SDSC-300x82.png> .
so it loses the <...>.
I think @id is already a sort of native type in JSON-LD. So not adding an XSD type in IRI fields might be more correct? Alternatively, having two field types, one for IRI node references and one for IRI properties (which adds the xsd type), might fit better?
I couldn't really find any detail on whether you could keep the @id-ness of the property, so to speak, while also adding an XSD type. Maybe we could ask on the JSON-LD repo which approach is correct?
It's certainly beyond me to judge whether your proposed solution is identical to the output produced without add_value_types, or whether having it as a string, without <...>, is meaningfully different (from a JSON-LD processor/RDF standpoint).
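To make the distinction above concrete, here is a minimal Python sketch that renders the two JSON-LD value shapes as N-Triples object terms. The helper to_ntriples_object is hypothetical and hand-rolled for illustration; it is not part of calamus or of any JSON-LD processor, and it assumes string values only.

```python
XSD_ANY_URI = "http://www.w3.org/2001/XMLSchema#anyURI"


def to_ntriples_object(value):
    """Render a JSON-LD value object as an N-Triples object term.

    Hypothetical, simplified helper: handles only {"@id": ...} node
    references and string literals with an optional datatype.
    """
    if "@id" in value:
        # Node reference: an IRI term, written in angle brackets.
        return f"<{value['@id']}>"
    # Literal: quoted lexical form, optionally followed by a datatype IRI.
    lexical = value["@value"].replace("\\", "\\\\").replace('"', '\\"')
    datatype = value.get("@type")
    return f'"{lexical}"^^<{datatype}>' if datatype else f'"{lexical}"'


logo = "https://datascience.ch/wp-content/uploads/2019/04/logo_SDSC-300x82.png"
as_reference = to_ntriples_object({"@id": logo})
as_literal = to_ntriples_object({"@value": logo, "@type": XSD_ANY_URI})
print(as_reference)
print(as_literal)
```

Under these assumptions the two objects are different RDF terms: the first is an IRI, the second a literal whose datatype happens to be xsd:anyURI.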
Thanks @Panaetius, I just had a chat with @rmfranken about this and it looks like there's no obvious way to represent xsd:anyURI in json-ld. However, json-ld.org uses "@type": "@id" in the context of type coercion:
{
  "@context":
  {
    ...
    "homepage":
    {
      "@id": "http://schema.org/homepage",
      "@type": "@id"
    }
    ...
  }
  ...
  "homepage": "http://manu.sporny.org/",
  ...
}
This approach seems to work in our case as well:
{
  "@id": "https://datascience.ch",
  "http://schema.org/logo": {
    "@value": "https://datascience.ch/wp-content/uploads/2019/04/logo_SDSC-300x82.png",
    "@type": "@id"
  },
  "@type": [
    "http://schema.org/Organization"
  ]
}
Converted to the following ttl:
@prefix schema: <http://schema.org/>.
<https://datascience.ch>
    a schema:Organization;
    schema:logo
        <https://datascience.ch/wp-content/uploads/2018/04/logo_SDSC-300x82.png>.
Which is effectively the same as just omitting "@type". I see two potential solutions:
add_value_types=True
I would lean towards the latter because with the conversion sequence json-ld > ttl > json-ld, it retains the same representation. By contrast, the first option loses the @type. What do you think?
Below are all the turtle serializations we considered for an IRI:
"myURI.com"^^xsd:anyURI :x: Not what we want (string that looks like a URI)
<myURI.com> :heavy_check_mark:
<myURI.com>^^xsd:anyURI :x: Invalid turtle syntax
<myURI.com> rdf:type xsd:anyURI :x: Incorrect: rdf:type is not appropriate for xsd datatypes
The valid form, <myURI.com>, can just be serialized as { "@id": "myURI.com" }. This is what the last commit does:
{
  "@id": "https://datascience.ch",
  "http://schema.org/logo": {
    "@id": "https://datascience.ch/wp-content/uploads/2019/04/logo_SDSC-300x82.png"
  },
  "@type": [
    "http://schema.org/Organization"
  ]
}
If you think add_value_types=True should explicitly add "@type": "@id", I can change it.
I think add_value_types not doing anything on IRI fields makes sense :+1:
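For readers skimming the thread, the agreed behaviour can be sketched roughly as follows. This is a hypothetical stand-in, not calamus's actual implementation: the function serialize_value and its kind argument are invented for illustration, and only the add_value_types flag name comes from the discussion.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def serialize_value(value, kind, add_value_types=False):
    """Hypothetical field serializer illustrating the agreed behaviour.

    kind is an illustrative label ("iri" or "string"), not a calamus concept.
    """
    if kind == "iri":
        # IRI fields are node references; add_value_types has no effect here.
        return {"@id": value}
    if kind == "string" and add_value_types:
        # Plain strings get an explicit XSD datatype when the flag is set.
        return {"@value": value, "@type": XSD_STRING}
    return value


logo = "https://datascience.ch/wp-content/uploads/2019/04/logo_SDSC-300x82.png"
iri_typed = serialize_value(logo, "iri", add_value_types=True)
iri_plain = serialize_value(logo, "iri", add_value_types=False)
name_typed = serialize_value("SDSC", "string", add_value_types=True)
print(iri_typed, iri_plain, name_typed)
```

In this sketch the IRI field yields the same { "@id": ... } object with or without the flag, while a string field only gains "@type" when the flag is set.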
I had to change some settings to run tests on PRs created from forks. You can merge now :slightly_smiling_face:
In calamus 0.4.0, IRIs are serialized as xsd:string when using add_value_types=True. They should instead be serialized as xsd:anyURI. This PR addresses the issue.
Example:
Output before the PR:
Output after the PR: