w3c / rdf-tests

Repository for the RDF Tests Community Group
w3c.github.io/rdf-tests

Move of RDF tests breaks existing test runners #113

Closed RubenVerborgh closed 11 months ago

RubenVerborgh commented 1 year ago

It appears that https://github.com/w3c/rdf-tests/commit/96d54913e0d88c2cd28fdfd9dd65c34707d582c1 breaks existing test runners, for example here: https://github.com/rdfjs/N3.js/actions/runs/5822558945/job/15787872888:

✖ turtle-subm-01
  Blank subject
  Error: Invalid data parsing
  Input: @prefix : <#> .
[] :x :y .

  Expected: [
  {
    "subject": "_:b118_genid1",
    "predicate": "https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl#x",
    "object": "https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl#y",
    "graph": ""
  }
]

  Got: [
  {
    "subject": "_:n3-50",
    "predicate": "http://w3c.github.io/rdf-tests/turtle/turtle-subm-01.ttl#x",
    "object": "http://w3c.github.io/rdf-tests/turtle/turtle-subm-01.ttl#y",
    "graph": ""
  }
]

Perhaps the introduction of mf:assumedTestBase is supposed to help with that. However, that doesn't change the fact that:

I.e., I don't think that mf:assumedTestBase can/should retroactively change the meaning of mf:action and mf:result.

CC: @rubensworks

gkellogg commented 1 year ago

Note that we left symbolic links to the old location of the tests in place as a transition, but if you run from that point, you will get different results (unless using the mf:assumedTestBase as the base location).

Your test results are consistent with running the test from https://w3c.github.io/rdf-tests/turtle/turtle-subm-01.ttl, whereas the new location is https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl. This change in location was signaled in PR #104, and PR #111 is on track to remove the symlinks from the old locations, specifically so that issues like this will be noted by implementors. Note that the HTML renders of the manifests within this repo (e.g., http://w3c.github.io/rdf-tests/) are consistent with the new site structure.
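The difference between the two outputs can be reproduced with plain relative-reference resolution. The following is a minimal sketch, assuming Python's standard `urllib.parse.urljoin` as a stand-in for an RFC 3986 resolver; it does not use N3.js or any test runner, and the variable names are illustrative only.

```python
from urllib.parse import urljoin

# Base IRIs taken from the "Got" and "Expected" output quoted in this thread.
OLD_LOCATION = "http://w3c.github.io/rdf-tests/turtle/turtle-subm-01.ttl"
NEW_LOCATION = "https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl"

# `@prefix : <#> .` makes `:x` shorthand for the relative reference `#x`,
# which is resolved against whatever base the runner hands to the parser.
got = urljoin(OLD_LOCATION, "#x")
expected = urljoin(NEW_LOCATION, "#x")

print(got)       # predicate produced when parsing from the old location
print(expected)  # predicate listed in the relocated expected results
```

The same input file yields different triples purely because the base IRI differs, which is why a runner still using the old location fails against the new expected results.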

Just to reiterate, the structure was initiated by the SPARQL tests carving out sparql10, sparql11, and sparql12 directories for tests from the relevant versions of SPARQL. The rdf11 and rdf12 directories are in place for RDF 1.1 and upcoming RDF 1.2 test suites.

The purpose of mf:assumedTestBase is to allow test runners some independence of the actual test locations, if this value is used to set the base URI for parsing the input files, which should be the same as if they are run from their existing HTTP locations. The exception is for RDF/XML, which needs to be worked out, as it uses further structure under rdf/rdf11/rdf-xml/ which assumes a relative base URI.

Sorry for this disruption in test location, but it was necessary in order to provide for the needs of further testing. Confusion may go away when PR #105 is merged, but that would look like a harder failure than the one you are experiencing right now.

jeswr commented 11 months ago

@gkellogg Note that for this particular test the result file is now

_:genid1 <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/#x> <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/#y> .

when it appears to have been

_:genid1 <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl#x> <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl#y> .

when this issue was first raised.

I believe the bottom form is the correct one and is consistent with your statement "...if this value is used to set the base URI for parsing the input files, which should be the same as if they are run from their existing HTTP locations. ...".

Could this please be fixed for this and similar failing test cases seen here?

gkellogg commented 11 months ago

@jeswr, note that the manifest lists the test base using mf:assumedTestBase <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/> ;. So, when parsing the Turtle input, @prefix : <#> . would expand to @prefix : <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/#> . Section 6.2 defines how to resolve IRI references, and points to RFC3986 section 5.2. Because we're overriding the document location, my interpretation is that (5.1.3) URI used to retrieve the entity does not apply, and we're specifying (5.1.4) Default Base URI (application-dependent).
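As a rough illustration of this interpretation, the sketch below passes the mf:assumedTestBase value directly as the parser base. It assumes Python's standard `urllib.parse.urljoin` as the RFC 3986 resolver; the constant name is hypothetical and not part of any manifest vocabulary.

```python
from urllib.parse import urljoin

# Value of mf:assumedTestBase as listed in the manifest.
ASSUMED_TEST_BASE = "https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/"

# Under this reading the assumed base is used as-is for parsing, so the
# document's own file name never enters the resolution.
predicate = urljoin(ASSUMED_TEST_BASE, "#x")
print(predicate)
```

The resolved IRI keeps the directory path and drops the file name, which matches the form now found in the result file quoted above.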

To get the effect you're looking for, we would need to add something that defines that 5.1.3 is established as the base name of the document location (<turtle-subm-01.ttl> in this case), for which 5.1.4 is used to define the base IRI used for resolving <#> to get <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl#>. Note that, for more complicated test directory structures, such as for rdf-xml, this wouldn't result in the previous interpretations.

What's clearly missing is an entry in ns/test-manifest.ttl that defines assumedTestBase, but I believe we're consistent that this is the URI base used for resolving relative IRIs, even though it does produce a different result than the original tests. (@afs may have a different opinion).

afs commented 11 months ago

Jena on the rdf11 directory currently:

Time: 0.348
There were 2 failures:
1) T-450: turtle-subm-01 (turtle-subm-01.ttl)
2) T-755: trig-subm-01 (trig-subm-01.trig)

FAILURES!!!
Tests run: 1007,  Failures: 2

The rdf-xml tests (e.g. rdfms-difference-between-ID-and-about/test1.nt) follow the "with file name" form; in their case, with a subdirectory as well.

Tests *-subm-27 aren't affected; they come to the same outcome.

There are 20 RDF/XML tests with relative URIs.

Jena's test code interprets mf:assumedTestBase to mean the base URI of the manifest file.

 mf:action    <turtle-subm-01.ttl> ;

Reading the file <turtle-subm-01.ttl> in that context resolves the file name to give the base for parsing, which yields .../turtle-subm-01.ttl#x.

turtle-subm-27 and trig-subm-27 aren't affected by the choice because there is no extra directory path, unlike RDF/XML tests.

gkellogg commented 11 months ago

So, the emerging opinion seems to be to define the semantics of mf:assumedTestBase to be the effective IRI of the manifest, rather than the base IRI used for running each test. Relative IRIs contained within the manifest are to be resolved against the assumed manifest IRI, even though they may be loaded from a local file. This would resolve <turtle-subm-01.ttl> to <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl> and result in IRIs within that file expanding as <https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/turtle-subm-01.ttl#x>, as before. This should also allow it to be used within the rdf-xml tests, which have a deeper structure.
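The two-step resolution described here could be sketched roughly as follows. This is only an illustration under stated assumptions: `urllib.parse.urljoin` stands in for the RFC 3986 resolver, the helper names `parse_base_for` and `expand` are hypothetical, and the rdf-xml base IRI is inferred from the rdf/rdf11/rdf-xml/ path mentioned earlier in the thread.

```python
from urllib.parse import urljoin


def parse_base_for(assumed_manifest_iri: str, action: str) -> str:
    """Step 1: resolve a manifest-relative mf:action against the assumed manifest IRI."""
    return urljoin(assumed_manifest_iri, action)


def expand(parse_base: str, reference: str) -> str:
    """Step 2: resolve a relative IRI found inside the test file against its parse base."""
    return urljoin(parse_base, reference)


TURTLE_BASE = "https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-turtle/"
RDFXML_BASE = "https://w3c.github.io/rdf-tests/rdf/rdf11/rdf-xml/"  # inferred, see lead-in

# The manifest may be read from a local checkout; only the assumed IRI matters.
turtle_doc = parse_base_for(TURTLE_BASE, "turtle-subm-01.ttl")
turtle_x = expand(turtle_doc, "#x")

# A deeper structure keeps its subdirectory in the resolved document IRI.
rdfxml_doc = parse_base_for(RDFXML_BASE, "rdfms-difference-between-ID-and-about/test1.nt")

print(turtle_x)
print(rdfxml_doc)
```

Because the file name (and any subdirectory) is part of the resolved action IRI, the expanded IRIs come out the same as if the files had been fetched from their HTTP locations.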

I'll add a suitable definition to test-manifest.ttl (which should eventually be synced to W3C space) and update expected result files accordingly, in the various test suites.