scientist-softserv / oral-history

UCLA LIBRARY-CENTER FOR ORAL HISTORY RESEARCH --Documenting the histories of Los Angeles-- The UCLA Library creates a vibrant nexus of ideas, collections, expertise, and spaces in which users illuminate solutions for local and global challenges. We constantly evolve to advance UCLA’s research, education, and public service mission by empowering and
https://oralhistory.library.ucla.edu/

[SPIKE] What would need to be done for a solr 8.11 upgrade #22

Closed crisr15 closed 5 months ago

crisr15 commented 1 year ago

Summary

Look into upgrade paths from Solr 7.7 to 8.11. This ticket should explore what needs to be done for the upgrade, including whether any migration is required, whether items will need to be reimported, and whether the wave files will need to be remade. (Please note, we would prefer not to remake the wave files, as doing so is time-consuming.)

Acceptance Criteria

Related ticket created: https://github.com/scientist-softserv/oral-history/issues/25

summer-cook commented 1 year ago

I used ChatGPT for a first pass at this, and I think it did a pretty good job. I plan on checking the external resources to see if there's anything it missed, but here is what it had so far:

Possible steps to follow

  1. Review the release notes for Solr 8.11 to identify any changes or deprecations that may impact the Oral History Rails app. Pay particular attention to changes related to schema, configuration files, and API changes.
  2. Make a backup of your Solr indexes and configuration files before upgrading to ensure that you can revert back if necessary.
  3. Install Solr 8.11 and configure it to work with the Oral History Rails app. This may involve updating your Solr configuration files to reflect any changes in the new version of Solr.
  4. If there are any changes to the Solr schema or configuration files, you may need to update the Oral History Rails app to reflect these changes. This may involve updating the solr-ruby gem or making changes to the application code to reflect changes in the Solr API.
  5. Depending on the changes in the Solr version, you may need to reindex your data with Solr 8.11. This will depend on how you originally indexed your data and whether the index format has changed in the new version of Solr.
  6. Since the Oral History Rails app appears to be using Solr for text search, it is unlikely that any WAV files used by the app would need to be recreated as part of the Solr upgrade.

Overall, the process for upgrading the Solr version used by the Oral History Rails app will involve reviewing changes in Solr 8.11, updating the Solr configuration files and the solr-ruby gem, and potentially reindexing data with the new version of Solr.
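As a rough illustration of steps 1 and 2 (confirming the running version and taking a backup before the upgrade), here is a minimal Python sketch. It assumes a standalone (non-SolrCloud) Solr at the default local address and a core named `oral-history`; both values are placeholders, not taken from the app's configuration. It relies on Solr's system info endpoint and the replication handler's `backup` command. A SolrCloud setup would use the Collections API for backups instead.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_URL = "http://localhost:8983/solr"  # assumption: default local Solr address
CORE = "oral-history"                    # hypothetical core name


def system_info_url(solr_url: str) -> str:
    """URL of Solr's system info endpoint, which reports the running version."""
    return f"{solr_url}/admin/info/system?wt=json"


def parse_solr_version(system_info: dict) -> str:
    """Pull the Solr version string out of a decoded system info response."""
    return system_info["lucene"]["solr-spec-version"]


def backup_url(solr_url: str, core: str, name: str) -> str:
    """URL that asks the replication handler to snapshot a core's index."""
    query = urlencode({"command": "backup", "name": name, "wt": "json"})
    return f"{solr_url}/{core}/replication?{query}"


def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON body (needs a running Solr)."""
    with urlopen(url) as response:
        return json.load(response)


def main() -> None:
    """Print the running version, then request a named backup."""
    info = fetch_json(system_info_url(SOLR_URL))
    print("Running Solr", parse_solr_version(info))
    print(fetch_json(backup_url(SOLR_URL, CORE, "pre-upgrade")))
```

Calling `main()` against the existing 7.7 instance before touching anything gives a named snapshot to fall back on. The snapshot name `pre-upgrade` is arbitrary.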

External Resources to learn more about upgrading from Solr 7.7 to Solr 8.11:

summer-cook commented 1 year ago

more important stuff

Solr has a new section of the Reference Guide, Reindexing, which covers several strategies for how to reindex.

Configsets

It’s now possible to overwrite an existing configset when uploading changes by supplying the overwrite=true parameter to the Configset API.

A related parameter is cleanup=true, which allows deleting any files from the old configset that are left behind after the overwrite.

The default for both of these parameters is false.

When deleting a collection that has an automatically created configset (i.e., the configset was copied from the _default collection when the collection was created), the configset will also be deleted if it is not in use by any other collection.
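To make the `overwrite` and `cleanup` parameters concrete, here is a small Python sketch that builds a Configset API upload request. The Solr address and the configset name `oral-history` are assumptions for illustration only, and the Configset API applies only when Solr runs in SolrCloud mode, which may not match how this app is deployed.

```python
from urllib.parse import urlencode
from urllib.request import Request


def configset_upload_url(solr_url, name, overwrite=False, cleanup=False):
    """Build the Configset API UPLOAD URL; both flags default to false, as in Solr."""
    params = {
        "action": "UPLOAD",
        "name": name,
        "overwrite": str(overwrite).lower(),
        "cleanup": str(cleanup).lower(),
    }
    return f"{solr_url}/admin/configs?{urlencode(params)}"


def configset_upload_request(solr_url, name, zip_bytes, **flags):
    """Prepare (but do not send) a POST carrying the zipped configset."""
    return Request(
        configset_upload_url(solr_url, name, **flags),
        data=zip_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would perform the upload. With `overwrite=True, cleanup=True`, files that exist only in the old configset are removed after the overwrite.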

Other useful links:

Solr 8.11 major changes: https://solr.apache.org/guide/solr/latest/upgrade-notes/major-changes-in-solr-8.html

Major changes between 7 & 8: https://solr.apache.org/guide/8_11/major-changes-in-solr-8.html#major-changes-in-earlier-7-x-versions

summer-cook commented 1 year ago

Related ticket created: https://github.com/scientist-softserv/oral-history/issues/25

summer-cook commented 1 year ago

Things to note:

solr version will need to be updated in the dockerfile:

image

crisr15 commented 1 year ago

@summer-cook Will the wave forms be deleted/need to be recreated when we reindex? Where are those stored?

summer-cook commented 1 year ago

@crisr15 I didn't find anything in the docs that talked about wav files specifically. They most likely wouldn't need to be deleted, but would be reindexed along with everything else in the app. Their readme has a specific section that says creating the wav files is very time intensive. I'm not sure exactly where the wav files are stored, but I'm pretty sure they use an OAI feed. One other thing to note: I spoke to Jeremy about how Blacklight is reindexed, and here is our convo:

Image

I also found this in the apache solr docs:

Image

Because Blacklight has no built-in equivalent of the reindex in Spotlight/Hyku, and oral history uses an OAI feed, the way to reindex would be the same as however they got the files into Solr in the first place, or potentially rerunning the OAI feed.

I don't know if something similar to Harvard's spotlight_oaipmh gem might be useful here, or if they already have a process in place for reindexing.
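For reference, "rerunning the OAI feed" would roughly mean paging through the feed again and sending each record back to Solr. The Python sketch below covers only the paging half, following the OAI-PMH `ListRecords` / `resumptionToken` convention. It is not the app's actual import code: the function names and the `fetch` callable are hypothetical, and the step that maps each record to a Solr document is left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords URL; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.iterfind(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    has_token = token_el is not None and token_el.text
    return identifiers, (token_el.text.strip() if has_token else None)


def harvest(base_url, fetch):
    """Yield every record identifier; `fetch` is a callable mapping URL -> XML text."""
    token = None
    while True:
        url = list_records_url(base_url, resumption_token=token)
        identifiers, token = parse_page(fetch(url))
        yield from identifiers
        if not token:  # an empty or missing token marks the last page
            break
```

Taking `fetch` as a parameter keeps the paging logic testable without a live feed. In real use it would be a small function that performs the HTTP GET.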