w3c / rdf-star

RDF-star specification
https://w3c.github.io/rdf-star/

Self-referential resource URI for testing manifest documents is incorrect #269

Open jeswr opened 1 year ago

jeswr commented 1 year ago

The manifest documents for the test suites are located at URLs like:

https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl

But within the manifest documents, they refer to themselves using a fragment identifier, e.g.

https://w3c.github.io/rdf-star/tests/trig/syntax#manifest

This results in problems in tools trying to use the manifest automatically, in particular https://github.com/rubensworks/rdf-test-suite.js/issues/77

My suggestion would be to use the relative URI <> to refer to the manifest within the document itself.
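To make the mismatch concrete, here is a minimal TypeScript sketch that resolves the relative reference <> against the document URL and compares the result with the fragment-based identifier. It relies only on the standard WHATWG URL class. The variable names are illustrative, and the sketch assumes the manifest declares no explicit base, so the base is the document URL itself.

```typescript
// URL at which the manifest document is published (taken from the report above).
const documentUrl: string = "https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl";

// Identifier the manifest currently uses for itself (taken from the report above).
const currentSubject: string = "https://w3c.github.io/rdf-star/tests/trig/syntax#manifest";

// An empty relative reference, written <> in Turtle, resolves to the base IRI.
// Assumption: the base is the document URL (no explicit base declaration).
const relativeSubject: string = new URL("", documentUrl).href;

// A tool that assumes "manifest subject == document URL" succeeds only in the second case.
const currentMatches: boolean = currentSubject === documentUrl;
const relativeMatches: boolean = relativeSubject === documentUrl;

console.log({ currentMatches, relativeMatches });
```

With the fragment identifier the comparison fails; with <> it succeeds, which is the behaviour the tool relies on.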

afs commented 1 year ago

Other tools have been processing the files successfully for a while now. Could you explain more, please?

Taking the TriG syntax tests as an example: if you mean:

trs:manifest  rdf:type mf:Manifest ;

There is more than one syntax file for the manifests - Turtle and JSON-LD. They are the same RDF graphs. Relative URIs would mean producing one from the other is not a simple syntax translation.

Tools can look for ??? rdf:type mf:Manifest.
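A type-based lookup of this kind could be sketched as follows in TypeScript. The Triple interface and the sample data are hypothetical stand-ins for whatever an RDF library would produce after parsing the manifest; only the rdf:type and mf:Manifest IRIs are the real vocabulary terms.

```typescript
// Minimal triple shape for illustration; a real tool would use its RDF library's own type.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

const RDF_TYPE: string = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const MF_MANIFEST: string = "http://www.w3.org/2001/sw/DataAccess/tests/test-manifest#Manifest";

// Return every distinct subject that is declared to be an mf:Manifest.
function findManifestSubjects(triples: Triple[]): string[] {
  const subjects = new Set<string>();
  for (const t of triples) {
    if (t.predicate === RDF_TYPE && t.object === MF_MANIFEST) {
      subjects.add(t.subject);
    }
  }
  return Array.from(subjects);
}

// Hand-written stand-in for the parsed TriG syntax manifest (not real parser output).
const sampleTriples: Triple[] = [
  {
    subject: "https://w3c.github.io/rdf-star/tests/trig/syntax#manifest",
    predicate: RDF_TYPE,
    object: MF_MANIFEST,
  },
  {
    subject: "https://w3c.github.io/rdf-star/tests/trig/syntax#manifest",
    predicate: "http://www.w3.org/2000/01/rdf-schema#label",
    object: "example label",
  },
];

const manifestSubjects: string[] = findManifestSubjects(sampleTriples);
console.log(manifestSubjects);
```

This approach does not depend on where the document is published, only on the graph it contains.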

jeswr commented 1 year ago

Rather than looking up all instances of type mf:Manifest, the rdf-test-suite is used by running the command

rdf-test-suite parser.js https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl

Here the first argument, parser.js, is the JavaScript parser you are testing, and the second argument is the URI of the manifest document. The tool expects this URI to be the same as the resource URI of the manifest.

This tool works with a wide variety of test suites, including all standardized RDF 1.1 syntaxes, all of which use relative URIs in their published JSON-LD and Turtle manifests (e.g. https://w3c.github.io/rdf-tests/trig/manifest.ttl).

> They are the same RDF graphs. Relative URIs would mean producing one from the other is not a simple syntax translation.

Whilst the definition does not necessarily require it, it seems to me that the convention has become that an instance of mf:Manifest should be a machine-readable document containing an RDF collection of manifest entries (rather than just any old rdf:Resource required to have an mf:entries property), as seen in the RDF 1.1 case described above. So, at the very least, https://w3c.github.io/rdf-star/tests/trig/syntax#manifest should dereference to a machine-readable RDF document.

I understand the case that having relative URIs would result in more than one graph semantically, but I also don't think it would be semantically problematic to have several identifiers where

trs:manifest owl:sameAs https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl
trs:manifest owl:sameAs https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.jsonld

My alternative (and perhaps cleanest) recommendation would be that the webpage https://w3c.github.io/rdf-star/tests/trig/syntax/ should contain (using RDFa) the RDF data that is also contained in https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl, so that when looking up https://w3c.github.io/rdf-star/tests/trig/syntax#manifest the relevant RDF data can be dereferenced.

TallTed commented 1 year ago

@jeswr -- You appear to have a misunderstanding of the meaning of owl:sameAs. This is not a predicate of "equivalence" or "similarity", but of co-reference. Your two statements of —

trs:manifest owl:sameAs https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl
trs:manifest owl:sameAs https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.jsonld

— indicate that the same entity is identified by the three URIs —

trs:manifest 
https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl
https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.jsonld

— as well as clearly implying that

https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl owl:sameAs https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.jsonld

We do need to know what the trs: prefix maps to, in order to have full understanding of what you would be saying, but I hope it is clear that regardless of the expansion of that CURIE, the meaning of what you have written is not what you intended.
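The implication can be illustrated with a small TypeScript sketch. Because owl:sameAs is symmetric and transitive, each statement merges the co-reference classes of its subject and object, which a union-find structure models directly. The helper names are hypothetical, and trs:manifest is deliberately left unexpanded because its prefix mapping is not given here.

```typescript
// Maps each IRI to its parent in a union-find forest of co-reference classes.
const corefParent = new Map<string, string>();

// Find the representative of the class that contains the given IRI.
function findRoot(iri: string): string {
  if (!corefParent.has(iri)) corefParent.set(iri, iri);
  let root = iri;
  while (corefParent.get(root) !== root) {
    root = corefParent.get(root) as string;
  }
  return root;
}

// Record "a owl:sameAs b" by merging the two classes.
function addSameAs(a: string, b: string): void {
  corefParent.set(findRoot(a), findRoot(b));
}

const ttlDoc: string = "https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl";
const jsonldDoc: string = "https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.jsonld";

// The two statements from the comment under discussion.
addSameAs("trs:manifest", ttlDoc);
addSameAs("trs:manifest", jsonldDoc);

// Both documents now fall in the same class, i.e. they are asserted to be one entity.
const docsCorefer: boolean = findRoot(ttlDoc) === findRoot(jsonldDoc);
console.log(docsCorefer);
```

The two statements alone are enough to place the Turtle and JSON-LD documents in one class, even though no statement relates them directly.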

Also, please note that https://w3c.github.io/rdf-star/tests/trig/syntax/ is not a web page. At best, it is a shorthand which an HTTPS browser requests, in response to which an HTTPS server will probably redirect (visibly or not) to, and deliver the content of, https://w3c.github.io/rdf-star/tests/trig/syntax/index.html.

These comments may seem nitpicky and trivial, but for the topics being discussed here, they are absolutely vital to clear expression and comprehension.

gkellogg commented 1 year ago

The manifest at https://w3c.github.io/rdf-star/tests/index.html contains a rel=alternate relationship to the Turtle and JSON-LD versions, which should probably follow for the other HTML versions of the manifest.

IIRC, the trs:manifest entity was created to deal with the explicit serialization form represented by using <> or <manifest.ttl>, as @afs suggested, but which was the norm for other RDF test suites (mistakenly, IMO).

The JSON-LD test suite uses @base set to manifest to do something similar to what trs: accomplishes, by removing the dependence on serialization.

For RDF-star, IIRC, we're concerned about the test suite moving, so that the reported manifest becomes disjoint with its location. I think when the WG is formed, a space in w3.org should be mirrored, and content negotiation used so that a fetch to (say) https://www.w3.org/2022/rdf-star/tests/trig/syntax/manifest would return one of the various forms (with some provision for HTML short of just including the JSON-LD manifest in a script tag). Then it would be clear that the requested URL would be the same as the manifest subject, independent of the serialization used. But with the CG, we don't have those tools available to us.

afs commented 1 year ago

> Where the first argument parser.js is the JavaScript parser you are testing, and the second argument is the URI of the manifest document and it is expected that this is the same as the resource URI for the manifest file.

The link is line 46 of ManifestLoader.ts.

At lines 47-53 there is code to deal with RDFa. The same approach could be applied for RDF-star:

"/manifest.ttl" => "#"

Looking for rdf:type mf:Manifest would work in all cases, and would be able to warn if there are unexpectedly two or more such resources.
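A rough TypeScript sketch of the rewrite idea follows. It is not taken from ManifestLoader.ts; the helper name is hypothetical. It also reads the rewrite as replacing a trailing "/manifest.ttl" with "#manifest", since that reproduces the identifier quoted earlier in the thread; that reading is an assumption.

```typescript
// Given the URL passed on the command line, list the IRIs that might identify
// the manifest resource. Hypothetical helper, shown for illustration only.
function candidateManifestIris(documentUrl: string): string[] {
  // First candidate: the document URL itself (what the tool expects today).
  const candidates: string[] = [documentUrl];

  // Second candidate: the fragment form used by the RDF-star manifests.
  const suffix = "/manifest.ttl";
  if (documentUrl.endsWith(suffix)) {
    const stem = documentUrl.slice(0, -suffix.length);
    candidates.push(stem + "#manifest");
  }
  return candidates;
}

const trigCandidates: string[] = candidateManifestIris(
  "https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.ttl",
);
console.log(trigCandidates);
```

A loader could try each candidate in turn as the manifest subject. The type-based lookup remains the more general option, since it does not depend on any naming pattern.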

jeswr commented 1 year ago

I'm finally coming back to this - and to me, there is still one key problem which is that in the root manifest (https://w3c.github.io/rdf-star/tests/manifest.ttl) we have the statement

<https://w3c.github.io/rdf-star/tests#manifest> mf:include (
      <nt/syntax/manifest.ttl>
      ...
    ) .

The tool https://github.com/rubensworks/rdf-test-suite.js/issues/77 expects that the entries in this list are resources of type mf:Manifest; which holds true on all the other test suites it supports. However, this is not true for the rdf-star test suite because the URI of the manifest resource is <https://w3c.github.io/rdf-star/tests/nt/syntax#manifest> not https://w3c.github.io/rdf-star/tests/nt/syntax/manifest.ttl.
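The mismatch can be reproduced with a short TypeScript sketch that resolves the first mf:include entry against the root manifest URL and compares it with the identifier that actually carries the mf:Manifest type. It uses the standard WHATWG URL class and assumes the root manifest's base is its own document URL; the variable names are illustrative.

```typescript
// URL of the root manifest document (from the comment above).
const rootManifestUrl: string = "https://w3c.github.io/rdf-star/tests/manifest.ttl";

// First entry of the mf:include list, as written in the Turtle source.
const includeEntry: string = "nt/syntax/manifest.ttl";

// Resolve the relative reference against the root manifest's URL.
// Assumption: the base IRI is the document URL (no explicit base declaration).
const resolvedInclude: string = new URL(includeEntry, rootManifestUrl).href;

// Identifier that the N-Triples syntax manifest uses for its mf:Manifest resource.
const declaredSubject: string = "https://w3c.github.io/rdf-star/tests/nt/syntax#manifest";

// The tool expects the included IRI to be the typed manifest resource.
const includeIsTypedManifest: boolean = resolvedInclude === declaredSubject;

console.log(resolvedInclude);
console.log(includeIsTypedManifest);
```

The included IRI names the document, while the typed resource is a fragment of a different IRI, so a purely graph-based lookup of the included entry finds no manifest.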

What this also means is that if I were to load all of the RDF test suites into a single KG, I would not be able to run the rdf-star test suites (whilst I expect I could run the others), because this test suite relies on the implicit semantics of the document structure in which it is published to define which resources correspond to the manifest entries.

I appreciate that the WG is only now just kicking off so this may not be possible to address immediately, but it is a discussion that I would like to have at some point :)