eclipse-rdf4j / rdf4j

Eclipse RDF4J: scalable RDF for Java
https://rdf4j.org/
BSD 3-Clause "New" or "Revised" License
359 stars 161 forks source link

JSONLDSettings should permit setting the DOCUMENT_LOADER for the new (Titanium = Hasmac) DocumentLoader #5080

Open vorburger opened 1 month ago

vorburger commented 1 month ago

Current Behavior

One cannot set a custom (new) no.hasmac.jsonld.loader.DocumentLoader on JSONLDSettings in v5.0.1 - because DOCUMENT_LOADER is still the (old) com.github.jsonldjava.core.DocumentLoader instead.

Expected Behavior

You would either want to switch DOCUMENT_LOADER from com.github.jsonldjava.core.DocumentLoader to no.hasmac.jsonld.loader.DocumentLoader (preferred), or alternatively introduce some sort of DOCUMENT_LOADER_NEW or something (ugly).

Version

5.0.1

Are you interested in contributing a solution yourself?

Perhaps?

Anything else?

The more general issue here is that, coming from #4907, I've found that the "switch" from jsonldjava to Titanium/Hasmac JSON-LD is still somewhat partial, only; there's still a legacy implementation (in a separate artifact; fine, of course) but the RIO API still depends on com.github.jsonld-java:jsonld-java - because of the use of com.github.jsonldjava.core.DocumentLoader in JSONLDSettings.

The way I personally would solve this modularity conundrum is by moving those settings into separate e.g. LegacyJSONLDSettings (for jsonld-java) - but have neither in the API artifact; so move JSONLDSettings (without the com.github.jsonldjava.core.DocumentLoader) from core/rio/api/src/main/java/org/eclipse/rdf4j/rio/helpers to core/rio/jsonld/src/main/java/org/eclipse/rdf4j/rio/jsonld. (This would be a backwards incompatible API change, of course.)

This issue (probably) isn't going to block me, as in my application (I'll share the code, once I push it) I can just directly use EXPAND_CONTEXT which already uses no.hasmac.jsonld.document.Document.

But this (to me) somewhat mix of old/new JSON-LD libraries still seems like something perhaps worth cleaning up, some day.

Let me know if you would welcome a PR about this (but it's probably better to discuss and agree upon the "design" here beforehand).

hmottestad commented 1 month ago

Some relevant discussion: https://github.com/eclipse-rdf4j/rdf4j/discussions/5068

hmottestad commented 1 month ago

Originally we had copied over all the settings into the various Rio modules. This way we could allow for a migration path for users where they would have time to adopt the new settings before we removed them.

The use of enums had complicated this for our JSON-LD Rio module, so we didn't have any migration options for users in RDF4J 4.3.x.

I didn't want to break too many applications that depend on RDF4J, so I opted to not split out the settings for JSON-LD in RDF4J 5.

Not adding support for configuring a document loader with the new JSON-LD implementation was an oversight on my part though. I should have added a new setting for that.

We can get this added for RDF4J 5.1. Would appreciate some inspiration for what we should call the new setting.

Maybe

hmottestad commented 1 month ago

Or we could expose the entire json-ld options object from HASMAC JSON-LD.

barthanssens commented 1 month ago

Hmz, what about keeping the name of setting, but just make it a plain Object instead of a com.github.jsonldjava.core.DocumentLoader ?

That way both the old and the new JSON-LD parser would be able to use it (I guess), and it would remove the old jsonld-java library as a direct depency from the rio helpers package ?

vorburger commented 1 month ago

This issue (probably) isn't going to block me, as in my application (I'll share the code, once I push it)

FYI this is about RdfReaderConverterInto.java in https://github.com/enola-dev/enola/pull/796 (WIP).

Would appreciate some inspiration for what we should call the new setting.

DOCUMENT_LOADER_HASMAC is my favorite among the options you listed, because with that strongly typed API, this settings is de-facto specific to HASMAC anyway - so I like that name clearly expressing it. And you already have a "hard" dependency from API to HASMAC anyway (because of that FRAME and EXPAND_CONTEXT), so it's just one more.

Or we could expose the entire json-ld options object from HASMAC JSON-LD.

I'm not 100% sure what you mean by "the entire json-ld options object", but just having e.g. an HasmacJsonLdSettings only in <artifactId>rdf4j-rio-jsonld instead of <artifactId>rdf4j-rio-api to me seems the best because cleanest way to solve this modularity conundrum! The way it's currently done feels a little bit like "cheating," to me... 😃 as in, why have modules at all, then?

Hmz, what about keeping the name of setting, but just make it a plain Object instead of a com.github.jsonldjava.core.DocumentLoader ?

That would work as well, of course; just a less type-safe, seems a tiny bit like a shame having that RioSetting<T extends Object> design but then not actually using it? Using the same setting for both the old and new implementation also makes errors when upgrading more difficult to catch (runtime type cast exception only, instead of clear compile time). Ultimately this is probably more your (maintainers) choice than mine.. it's a trade-off on backwards compatibility and all. But if you are cool with breaking API in 5.1.0 anyway, then personally I would not do this, but go for an HasmacJsonLdSettings - if it was up to me.

This could also be done in 2 stages, of course.

hmottestad commented 1 month ago

This is the JSON-LD options object: https://github.com/HASMAC-AS/hasmac-json-ld/blob/main/src/main/java/no/hasmac/jsonld/JsonLdOptions.java

We can't make any breaking changes until RDF4J 6.

vorburger commented 1 month ago

This is the JSON-LD options object

Oh, I see; yeah I guess exposing it as RioSetting<JsonLdOptions> via an HASMAC_OPTIONS setting would make great sense.

We can't make any breaking changes until RDF4J 6.

Right, so then probably best in 2 stages:

  1. For a 5.1.0 add an RioSetting<JsonLdOptions> to a new class org.eclipse.rdf4j.rio.jsonld.HasmacJsonLdSettings extends org.eclipse.rdf4j.rio.helpersJSONLDSettings in rdf4j-rio-jsonld (NOT to rdf4j-rio-api)

  2. For a future 6.0.0 clean this up and remove the remove the old jsonld-java library as a direct dependency of the API module.

Does that sound reasonable? Let me know if you would welcome a (1st) PR about this.

hmottestad commented 1 month ago

I'm not completely sure if we should create a new settings class or not. The way that we've done it for the other settings classes is that we have duplicated them within their respective Rio modules. And we kept the same class name as before. When we tried the same for the json-ld settings we ran into issues with the use of enums.

I think I want to play around with it a bit to see what makes sense. If you want to make a PR it would be great, just keep in mind that we might need some changes. Hope that's ok :)