apache / jena

Apache Jena
https://jena.apache.org/
Apache License 2.0

Loading JsonLD context fails when bumping up version #1765

Open markedone66 opened 1 year ago

markedone66 commented 1 year ago

Version

4.7.0

Question

I'm using DocumentLoader to reroute the context URL, because the URL in the context is not accessible from our application. I made my own adaptation of these examples:

DocumentLoader

RDF Parser

The problem is that it all works perfectly on Jena 4.4.0 (maybe on later versions too; I did not test 4.5.0, 4.6.0, or 4.6.1), but as soon as I switch to the latest, 4.7.0, I start getting this error:

JsonLdError[code=There was a problem encountered loading a remote context [code=LOADING_REMOTE_CONTEXT_FAILED]., message=There was a problem encountered loading a remote context [code=LOADING_REMOTE_CONTEXT_FAILED].]

and then later down the stacktrace ...

Caused by: JsonLdError[code=The document could not be loaded or parsed [code=LOADING_DOCUMENT_FAILED]., message=Unexpected response code [403]]
org.apache.jena.riot.RiotException: There was a problem encountered loading a remote context [code=LOADING_REMOTE_CONTEXT_FAILED].

Jena should be redirected to a local file server that is running in my Docker setup. I can reach that server without any issue: when I pause in the debugger and click the URL it is supposed to fetch, the document loads. It is plain HTTP with no auth, so it should be accessible at all times. As I said, this works without any issues on 4.4.0, but I get a 403 when I switch to 4.7.0. What has changed? Do I need to add additional configuration?

Here is some of the logic:

JsonLdOptions options = new JsonLdOptions();
DocumentLoader dl = new DocumentLoader();
dl.addInjectedDoc(contextUrlFromPayload, contextFromFileServer);
options.setDocumentLoader(dl);

// pass the jsonldContext and JsonLdOptions to the read using a jena Context
JsonLDReadContext jenaCtx = new JsonLDReadContext();
jenaCtx.setOptions(options);

// read the jsonld, replacing its "@context"
Dataset ds = jsonld2dataset(eventPayload.toString(), jenaCtx, Lang.JSONLD);
Model mod = ModelFactory.createDefaultModel().read(schemaUrl);

and

private Dataset jsonld2dataset(String jsonld, Context jenaCtx, Lang lang) {
    Dataset ds = DatasetFactory.create();

    RDFParser
      .create()
      .fromString(jsonld)
      .errorHandler(ErrorHandlerFactory.errorHandlerNoLogging)
      .lang(lang)
      .context(jenaCtx)
      .parse(ds.asDatasetGraph());
    return ds;
}

Thanks a lot!

rvesse commented 1 year ago

We recently changed the default library for handling JSON-LD from jsonld-java (https://github.com/jsonld-java/jsonld-java) to Titanium (https://github.com/filip26/titanium-json-ld), and I don't know whether the new one respects the same JsonLDReadContext configuration that the older library did.

I think you may be able to force use of the old parser by using Lang.JSONLD10.
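To illustrate why swapping the engine could surface as a 403, here is a minimal, library-agnostic sketch in plain Java. `ReroutingLoader`, `inject`, and `load` are hypothetical names, not Jena, jsonld-java, or Titanium API. The assumption being illustrated is that an engine which never consults the configured loader would request the original context URL directly, which would be consistent with the 403 in the stack trace.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical loader for illustration only; not part of any JSON-LD library. */
final class ReroutingLoader {
    private final Map<URI, String> injected = new HashMap<>();
    private final Function<URI, String> remoteFetch;

    ReroutingLoader(Function<URI, String> remoteFetch) {
        this.remoteFetch = remoteFetch;
    }

    /** Register a local replacement document for a context URL. */
    void inject(URI contextUrl, String document) {
        injected.put(contextUrl, document);
    }

    /** Return the injected document if one exists; otherwise fall back to the remote fetch. */
    String load(URI contextUrl) {
        String local = injected.get(contextUrl);
        return local != null ? local : remoteFetch.apply(contextUrl);
    }

    public static void main(String[] args) {
        // Stand-in for a remote server that rejects the request.
        Function<URI, String> forbidden = url -> {
            throw new IllegalStateException("Unexpected response code [403] for " + url);
        };
        ReroutingLoader loader = new ReroutingLoader(forbidden);
        URI ctx = URI.create("https://example.org/context.jsonld");
        loader.inject(ctx, "{ \"@context\": {} }");

        // An engine that consults the loader is served locally, with no network access.
        System.out.println(loader.load(ctx));

        // An engine that bypasses the loader goes straight to the remote URL.
        try {
            forbidden.apply(ctx);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this is what is happening, the 403 would come from the original context URL rather than from the local file server, which would also explain why the file server looks fine when opened by hand.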

markedone66 commented 1 year ago

All right, that does fix the issue. Thanks a lot. My question, though: do you plan to write a new context reader that works with Titanium? :D Or at least update the old one? What are the plans for the future? Right now it looks like we can keep upgrading Jena, but we have to remember to specify Lang.JSONLD10 to force the old parser. That is going to be an issue if a customer sends us JSON-LD in a newer version (say, a hypothetical 1.4) that is not compatible with 1.0, and the entire thing breaks.

afs commented 1 year ago

Jena wraps up access to the underlying engine which does the real work.

How does Titanium handle the situation?

(Pull requests gratefully received!)

afs commented 1 year ago

The JSON-LD 1.0 implementation, jsonld-java, is likely to be removed from Jena later this year because that project has been looking for a maintainer for some time now and the last release was Dec 13, 2021.

In Jena 4.9.0, the JSON-LD 1.0-specific constants will be deprecated: see issue #1921 and PR #1922.

It would be helpful to know from JSON-LD 1.1 users how they deal with the issue when using Titanium.

ziodave commented 4 weeks ago

This is how I set the Document Cache and a custom DocumentLoader for JSONLD11 (Titanium):

    private final static JsonLdOptions JSONLD_OPTIONS_VALUE = new JsonLdOptions();

    static {

        // This sets a document loader that never fetches anything: for any URL it
        // returns a minimal context containing only @vocab, derived from that URL.
        JSONLD_OPTIONS_VALUE.setDocumentLoader((url, options) ->
                JsonDocument.of(new StringReader("{ \"@context\": { \"@vocab\": \"" + trailingSlashIt(url.toString()) + "\" } }")
                ));

        // Create a default document cache.
        val documentCache = Optional
                .ofNullable(JSONLD_OPTIONS_VALUE.getDocumentCache())
                .orElseGet(() -> new LruCache<>(256));

        try (val is = new ClassPathResource("contexts/schema.jsonld").getInputStream()) {
            val schemaJsonLdContext = JsonDocument.of(is);
            // Preload the schema.org JSON-LD context.
            documentCache.put("http://schema.org", schemaJsonLdContext);
            documentCache.put("https://schema.org", schemaJsonLdContext);
        } catch (JsonLdError | IOException e) {
            throw new RuntimeException(e);
        }

        JSONLD_OPTIONS_VALUE.setDocumentCache(documentCache);
    }

    /** Call to deserialize a string containing JSON-LD to a Model. */
    public static Model model(String value) {
        return RDFParserBuilder
                .create()
                .context(Context.create().set(JSONLD_OPTIONS, JSONLD_OPTIONS_VALUE))
                .lang(Lang.JSONLD11)
                .fromString(value)
                .build()
                .toModel();
    }

I hope this helps.
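For readers trying the snippet above: `trailingSlashIt` is not shown in the post. The sketch below is one plausible implementation, assuming it simply appends a `/` when the URL does not already end with one, together with the fallback context string that the loader lambda would then produce. `FallbackContext` and `forUrl` are hypothetical names introduced here for illustration.

```java
import java.net.URI;

/** Hypothetical helpers; trailingSlashIt is an assumed implementation, not the original one. */
final class FallbackContext {
    private FallbackContext() {}

    /** Assumption: append "/" unless the string already ends with one. */
    static String trailingSlashIt(String url) {
        return url.endsWith("/") ? url : url + "/";
    }

    /** Build the JSON text of a context whose only entry is @vocab, set to the given URL. */
    static String forUrl(URI url) {
        // A java.net.URI cannot contain raw quotes or backslashes, so plain concatenation is safe here.
        return "{ \"@context\": { \"@vocab\": \"" + trailingSlashIt(url.toString()) + "\" } }";
    }

    public static void main(String[] args) {
        // prints: { "@context": { "@vocab": "https://example.org/vocab/" } }
        System.out.println(forUrl(URI.create("https://example.org/vocab")));
    }
}
```

With a fallback like this, any context that is not preloaded into the cache resolves to a context that maps unprefixed terms into a namespace derived from the context URL, rather than triggering a network request.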