Open artntek opened 1 year ago
In the past, we have often upgraded a solr schema.xml file without reindexing the entire corpus, especially for DataONE, which has millions of versions of documents. That said, the SOLR documentation recommends against this, particularly for major version upgrades. It covers schema changes, solrconfig changes, and version upgrades here: https://solr.apache.org/guide/solr/latest/indexing-guide/reindexing.html
One mechanism to avoid downtime is to reindex to a new SOLR collection, so that the old index continues to exist. Once the new collection is available, then we can use a collection alias to atomically switch the service from the old index content to the new content.
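As a rough sketch of the alias switch described above: the SOLR Collections API has a CREATEALIAS action, and calling it with an alias name that already exists re-points that alias to the given collection. The base URL, alias name, and collection names below are placeholders (assumptions, not our actual deployment values), and this presumes SOLR is running in SolrCloud mode.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def create_alias_url(solr_base_url, alias, collection):
    """Build the Collections API URL that points `alias` at `collection`."""
    params = urlencode(
        {"action": "CREATEALIAS", "name": alias, "collections": collection}
    )
    return f"{solr_base_url.rstrip('/')}/admin/collections?{params}"


def switch_alias(solr_base_url, alias, collection, timeout=30):
    """Send the CREATEALIAS request and return the HTTP status code.

    Clients that always query the alias (never the underlying collection
    name) see the new index as soon as this call succeeds.
    """
    url = create_alias_url(solr_base_url, alias, collection)
    with urlopen(url, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical names: the alias is what clients query; the "-v2"
    # collection is the freshly reindexed one.
    print(create_alias_url("http://localhost:8983/solr", "metacat-index", "metacat-index-v2"))
```

Rolling back would be the same call with the old collection name, provided the old collection has not been deleted yet.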
When thinking about a SOLR cloud update, it would be good to contemplate a rolling update strategy where 1) existing SOLR pods continue operating and serving an existing collection index, probably in read-only mode; 2) a new version of the service is rolled out, and the new pods start up and begin reindexing the content into a new collection; 3) when the reindex is complete, the old pods are brought down, and the new pods begin serving requests, possibly after switching over with a collection alias. Extra bonus points for tying this seamlessly into Kubernetes rolling updates in a way that permits rollback to the original pods and indexed collection if for some reason the upgrade is not successful.
Of course, this is an ideal world -- we can manually manage this transition as well if such a set of features would significantly delay release.
Related background (but with elasticsearch): https://developers.soundcloud.com/blog/how-to-reindex-1-billion-documents-in-1-hour-at-soundcloud
Notes from related discussion in ESS-DIVE meeting: could use a sidecar container that checks periodically for a new index file, then triggers a reindex if changes are detected
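A minimal sketch of what such a sidecar loop might look like, assuming the file being watched is the schema file and that a change is detected by comparing content hashes. The function names, the polling interval, and the `on_change` callback (which would kick off the reindex) are all hypothetical; nothing here is an existing metacat component.

```python
import hashlib
import time
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file's current contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def watch_schema(path, on_change, interval_seconds=60, max_checks=None, sleep=time.sleep):
    """Poll `path` and call `on_change(path)` whenever its contents change.

    `max_checks` bounds the loop (None means run forever); `sleep` is
    injectable so the loop can be exercised without real waiting.
    """
    last = file_digest(path)
    checks = 0
    while max_checks is None or checks < max_checks:
        sleep(interval_seconds)
        current = file_digest(path)
        if current != last:
            on_change(path)  # e.g. trigger a reindex (hypothetical hook)
            last = current
        checks += 1
```

Hashing the contents rather than checking the modification time avoids triggering a reindex when the file is rewritten with identical content, for example when a ConfigMap is remounted.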
This is important for 3.0.0 release, since that release involves a schema upgrade
Example: if a helm upgrade changes the solr schema, we need to reindex everything, because the solr data on the PV is compliant with the old schema.
For the 3.0.0 release, this will be manual, and we can choose not to reindex for huge corpuses. There is no end-user impact other than not being able to access the new info in metacatui (e.g. the new license field).
Manage manually for 3.0; automate for 3.1
We need to decide upon and implement how we handle solr schema upgrades and their associated reindex actions.