Open donpellegrino opened 7 months ago
I think it's due to how RIOT (Jena's parser) handles the Turtle input:
@Test
public void i18nTest() throws IOException, ParserException {
    // https://github.com/rdfhdt/hdt-java/issues/203
    // Two triples whose objects are the same IRI written in two different forms.
    String data = "@prefix : <http://example/vocab#>.\n" +
            "\n" +
            " :s1 :p <example://a/b/c/%7Bfoo%7D#xyz>.\n" +
            " :s2 :p <eXAMPLE://a/./b/../b/%63/%7bfoo%7d#xyz>.\n";
    try (InputStream is = new ByteArrayInputStream(data.getBytes(ByteStringUtil.STRING_ENCODING))) {
        RDFParser build = RDFParser.source(is).lang(Lang.TURTLE).build();
        // Print every triple exactly as RIOT emits it; the other callbacks are no-ops.
        build.parse(new StreamRDF() {
            @Override
            public void triple(Triple triple) {
                System.out.println(triple);
            }

            @Override
            public void start() {}

            @Override
            public void quad(Quad quad) {}

            @Override
            public void base(String s) {}

            @Override
            public void prefix(String s, String s1) {}

            @Override
            public void finish() {}
        });
    }
}
This prints:
http://example/vocab#s1 @http://example/vocab#p example://a/b/c/%7Bfoo%7D#xyz
http://example/vocab#s2 @http://example/vocab#p eXAMPLE://a/b/%63/%7bfoo%7d#xyz
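To make it easier to see which part of the second IRI was rewritten, here is a minimal sketch that applies only dot-segment removal ("." and "..") using the JDK's java.net.URI. This is an illustration, not what RIOT does internally: Jena uses its own IRI handling, and the class and method names below (IriNormalizationSketch, removeDotSegments) are made up for this example.

```java
import java.net.URI;

/** Sketch: what dot-segment removal alone does to the two IRIs from the test data. */
final class IriNormalizationSketch {

    /**
     * Removes "." and ".." path segments via java.net.URI#normalize().
     * Assumption: the input is a syntactically valid URI; no other
     * normalization (scheme case, percent-encoding) is attempted here.
     */
    static String removeDotSegments(String iri) {
        return URI.create(iri).normalize().toString();
    }

    public static void main(String[] args) {
        String s1 = "example://a/b/c/%7Bfoo%7D#xyz";
        String s2 = "eXAMPLE://a/./b/../b/%63/%7bfoo%7d#xyz";

        String n1 = removeDotSegments(s1);
        String n2 = removeDotSegments(s2);

        System.out.println(n1);
        System.out.println(n2);
        // Compare the two results as plain strings.
        System.out.println("equal as strings: " + n1.equals(n2));
    }
}
```

On a stock JDK this should drop the "/./" and "/b/../" parts of the second IRI while leaving the scheme case (eXAMPLE) and the %63 and %7b escapes as written, which is the same shape as the RIOT output above. So the two objects still differ as strings after this step.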
The parser itself is configured in the org.rdfhdt.hdt.rdf.parsers.RDFParserRIOT#parse() method, if you want to take a look.
Should this issue be submitted upstream against Jena's RIOT instead of here in hdt-java?
I don't know; it might be caused by a missing configuration on our side. It would be better to check that first.
See https://lists.w3.org/Archives/Public/public-rdf-dawg/2005JulSep/0096 for details on the test case. Note the emphasis on the presence of "." and ".." in the URLs.
Test case definition:
Defined in https://github.com/w3c/rdf-tests/blob/main/sparql/sparql10/i18n/manifest.ttl
Using hdt-c++:
Using hdt-java:
Note that this test case is referenced multiple times in the hdt-java codebase (hdt-jena/testing/DAWG-Final/i18n and hdt-jena/testing/DAWG). However, it is unclear to me where these tests are run, so I could not check their status.
Test system
hdt-c++
hdt-java: hdt-java-package-3.0.10