eclipse-rdf4j / rdf4j

Eclipse RDF4J: scalable RDF for Java
https://rdf4j.org/
BSD 3-Clause "New" or "Revised" License

Upgrade Lucene/Solr/Elastic to latest version #4441

Open erikgb opened 1 year ago

erikgb commented 1 year ago

Problem description

While working on https://github.com/eclipse/rdf4j/issues/3559, I realized that we have to upgrade Solr to version 9: Solr 8 is based on Jetty 9, and we are forced to upgrade to Jetty 10 to migrate to Java/Jakarta EE 8. Draft PR: https://github.com/eclipse/rdf4j/pull/4397.

Before performing a major upgrade of Solr, I suggest upgrading to the latest minor version and fixing deprecation warnings. This will ease the major version upgrade.

Preferred solution

Solr dependencies are bumped to the latest minor update and usage of deprecated APIs is fixed.

Are you interested in contributing a solution yourself?

Perhaps?

Alternatives you've considered

No response

Anything else?

No response

barthanssens commented 1 year ago

This is related to https://github.com/eclipse/rdf4j/issues/3396

Unfortunately there is also Elasticsearch, which, like Solr, uses the Lucene libraries...

I'd prefer to keep both ES and Solr on the same version of Lucene... but no hard feelings if the more practical approach would be to include different versions depending on the search-solution being selected...

erikgb commented 1 year ago

Thanks @barthanssens! I was about to write a comment after discovering #3396. Somehow we need to sort out the issues related to the javax-jakarta migration. Maybe I should open a new discussion, or continue under https://github.com/eclipse/rdf4j/discussions/4433? WDYT? After fixing the immediate issues with WireMock in https://github.com/eclipse/rdf4j/pull/4439, this is now the blocker for a migration to Java/Jakarta EE 8.

barthanssens commented 1 year ago

Sure, I do remember that upgrading is not as straightforward as I thought it would be (see also the changes in 3396), but perhaps I've missed some obvious fixes/work-arounds...

barthanssens commented 1 year ago

The changes include migrating ES to a new client library (again; see https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/current/migrate-hlrc.html , and it appears to be Apache licensed: https://mvnrepository.com/artifact/co.elastic.clients/elasticsearch-java/8.9.0 ), using Jetty 10 for Solr, and some changes in geo (some deprecated / moved methods or classes) in Lucene and/or ES.

And the usual IP verification, especially for ES

hmottestad commented 1 year ago

The Java client for Elasticsearch provides strongly typed requests and responses for all Elasticsearch APIs. It delegates protocol handling to an http client such as the Elasticsearch Low Level REST client that takes care of all transport-level concerns (http connection establishment and pooling, retries, etc).

The Elasticsearch Low Level REST client is simply the elasticsearch-rest-client, and it's also Apache 2.0 :)

https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-client

So if we use the Java API Client together with the Low Level REST client, then we could be fully Apache 2.0 for all our production code. Which would make upgrading to newer versions that much easier in the future :)
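For reference, a build that pulls in the two client artifacts linked in this thread might declare them roughly like this. It is only a sketch: the `elasticsearch.version` property name is an assumption, 8.9.0 is simply the version of the mvnrepository link posted earlier, and other dependencies (for example a JSON mapping library) may well be needed on top.

```xml
<!-- Sketch only: the property name is hypothetical; 8.9.0 is the version linked earlier in this thread -->
<properties>
  <elasticsearch.version>8.9.0</elasticsearch.version>
</properties>

<dependencies>
  <!-- Java API Client (strongly typed requests and responses) -->
  <dependency>
    <groupId>co.elastic.clients</groupId>
    <artifactId>elasticsearch-java</artifactId>
    <version>${elasticsearch.version}</version>
  </dependency>
  <!-- Low Level REST client (transport-level concerns) -->
  <dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-client</artifactId>
    <version>${elasticsearch.version}</version>
  </dependency>
</dependencies>
```

Running `mvn dependency:tree` on such a setup would be one way to check whether anything non-Apache-2.0 still sneaks in transitively.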

barthanssens commented 1 year ago

The Elasticsearch Low Level REST client is simply the elasticsearch-rest-client, and it's also Apache 2.0 :)

Well, sort of. There are 3 client APIs, at least that's how I read the docs: the new Java client library indeed uses the existing low-level client, but replaces the less-new REST high-level client.

Let's hope for a clean separation (e.g. no dependencies on, say, a common server package).