DSpace / DSpace

(Official) The DSpace digital asset management system that powers your Institutional Repository
https://wiki.lyrasis.org/display/DSDOC8x/
BSD 3-Clause "New" or "Revised" License

Upgrade DSpace to Solr 9 #9300

Open Leano1998 opened 5 months ago

Leano1998 commented 5 months ago

I took some time to examine the implications of a Solr 9 upgrade, especially because a comment in the Lyrasis Wiki suggests that supporting Solr 9 in DSpace 7 shouldn't be a big problem, since we would only need to update the configuration of our Solr core(s). This is partially correct. As far as I have tested, Solr 9 works quite well with the suggested changes, and these are in line with the Solr changelogs. The main problem, as far as I can see, is upgrading the Solr and Lucene Maven libraries to version 9, which would require some more work. For now that is not mandatory for a Solr upgrade: Solr 9 works without upgrading the libraries, and I haven't seen any related issues while testing. But it might be inconsistent to upgrade our Solr support without also upgrading the Maven libraries.

I thought I would put this information here to open it up for discussion.

tdonohue commented 5 months ago

@Leano1998 : Thanks for your notes on this. I want to warn though that we will be doing a major dependency update in the near future for 8.0 to solve #8713 . This major dependency update doesn't currently involve Solr (as it's not required as you noted) but it does involve most everything else (Spring, Hibernate, etc).

I wanted to call this out because this might be a good opportunity to also upgrade Solr... or do the Solr upgrade after those other major dependencies are updated. I agree that it'd be nice to move to Solr 9. It's just not as high priority right now as #8713.

The #8713 work is scheduled for Feb 26 - Mar 15 (see the schedule at the top of the dev mtg agenda). We might be able to do the Solr upgrade during that same period. If you are interested in helping then, let me know. Otherwise, I can look for other volunteers.

For now, I'm moving this to the 8.0 board, and we'll see if we can fit it into 8.0.

mwoodiupui commented 5 months ago

I would not consider it a high priority to upgrade SolrJ in DSpace unless we want some fix or new feature that is brought by an upgraded SolrJ. Solr is quite tolerant of version skew between client and server.

Lucene version N will refuse to open a core which was created with version N-2 or earlier, but this is not a client issue.
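As a rough illustration of that rule, here is a minimal sketch that reduces it to a comparison of major versions. The class and method names are hypothetical and are not part of Lucene or DSpace; the real check is performed by Lucene itself when it opens the index, and this sketch ignores minor versions entirely.

```java
// Hypothetical helper: models the "N will not open N-2 or earlier" rule
// using major version numbers only. Not a Lucene or DSpace API.
final class LuceneIndexCompat {
    private LuceneIndexCompat() {}

    /**
     * Returns true if a Lucene runtime with major version runtimeMajor
     * is expected to open an index created with major version createdMajor.
     * Assumption: an index from the same or the immediately preceding
     * major version is readable; anything older or newer is refused.
     */
    static boolean canOpen(int runtimeMajor, int createdMajor) {
        return createdMajor <= runtimeMajor
            && createdMajor >= runtimeMajor - 1;
    }

    public static void main(String[] args) {
        // A core created under Solr 8 (Lucene 8), opened by Solr 9 (Lucene 9).
        System.out.println(canOpen(9, 8));
        // A core created under Lucene 7, opened by Lucene 9.
        System.out.println(canOpen(9, 7));
    }
}
```

Under these assumptions, a core first created on Lucene 7 and never reindexed would need to be rebuilt before a Lucene 9 server could open it, whichever SolrJ version the client uses.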

Occasionally something like a field type will be deprecated and then removed across major releases. This happened in DSpace's upgrade from Solr 4 to Solr 8. Again this is not a client issue -- it involves schema changes that should be transparent to clients.

I suggest that we should strive to decouple our version requirements for Solr client and server as much as is reasonable. This should not be difficult. We should carefully test DSpace's use of Solr to assure ourselves that a client upgrade is not critical, and treat client upgrades as something that we do as time permits. Solr's web API is quite stable, and SolrJ just handles the interface between that and Java objects.
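To make the point about the web API concrete, here is a minimal sketch of the kind of request that a client ultimately sends to Solr: a plain HTTP URL against a core's `/select` handler. The class name is hypothetical, and the base URL `http://localhost:8983/solr` and the core name `search` are assumptions about a local installation. This is not how SolrJ builds requests internally; it only shows the shape of the HTTP interface that a server on either version exposes.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of SolrJ or DSpace.
final class SolrSelectUri {
    private SolrSelectUri() {}

    /**
     * Builds a /select URI for the given core. The baseUrl and core values
     * are supplied by the caller; nothing here contacts a server.
     */
    static URI build(String baseUrl, String core, String query, int rows) {
        // Percent-encode the query so characters such as ':' are safe in a URL.
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + core
            + "/select?q=" + q + "&rows=" + rows + "&wt=json");
    }

    public static void main(String[] args) {
        // Assumed local Solr URL and core name; adjust for a real installation.
        System.out.println(build("http://localhost:8983/solr", "search", "*:*", 0));
    }
}
```

Sending the same URL to a Solr 8 and a Solr 9 server and comparing the responses is one simple way to test for version skew without involving SolrJ at all.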

I should add that in general I see no compelling reason to avoid upgrading SolrJ. It would be well to keep the client version fairly recent. I just think there is no reason why client and server must be upgraded in lockstep.

Leano1998 commented 5 months ago

Thank you both for your thoughts on this issue. I agree with @mwoodiupui that there is no cogent reason to upgrade client and server at the same time. We tested Solr 9 with DSpace 7 in different settings at our institution, and it is working fine so far.

But if we offer Solr 9 support in DSpace 8, we have to consider how to handle Solr 8 compatibility, because the locations of the contributed libraries have changed. We could leave the Solr 8 paths in comments and provide additional documentation. I could try to create the PR and documentation for this, but I don't know if I'll find the time to do a SolrJ update before the end of March.

tdonohue commented 4 months ago

NOTE OF CAUTION: While working on our upgrade to Jakarta EE compatibility (#9321), I've discovered that Solr does not have full Jakarta EE compatibility yet, which can complicate this upgrade.

This incompatibility is the reason why Spring Boot 3 has (temporarily) removed all autoconfiguration for Apache Solr. See https://github.com/spring-projects/spring-boot/issues/31054 (which confirms the details I've noted above)

In other words, we may need to consider whether to stay on Solr 8 until Solr achieves Jakarta EE compatibility by updating to use Jetty 11 or later. Here are the open Solr tickets that we should watch:

Based on recent comments to those tickets, a Jetty upgrade is being considered for Solr 10.

mwoodiupui commented 4 months ago

I will observe again that we need to carefully distinguish Solr and SolrJ. SolrJ has these Jakarta issues. Solr v5+ is self-contained and thus does not. Our tests make use of embedded Solr, so we do have to be careful there. But sites should be able to upgrade Solr to v9 even though DSpace uses SolrJ v8. The above-mentioned incompatibilities are relevant to this issue, but only with respect to DSpace itself, not to its external infrastructure requirements.