Open gforcada opened 5 months ago
@gforcada I was thinking about the same thing, but haven't had a chance to work on it. I think a key thing to solve is making sure that the indexing of the new collection has a way to catch up with changes to any documents that are modified during the reindex process.
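To make the "catch up" idea concrete, here is a rough sketch, assuming the reindex start time is recorded and each document exposes a modification timestamp. The `changed_since` helper and the `docs` mapping are hypothetical names for illustration only, not part of collective.solr:

```python
from datetime import datetime


def changed_since(docs, reindex_started):
    """Return ids of documents modified at or after the reindex start.

    `docs` is a hypothetical mapping of document id -> last-modified
    datetime; a real site would query its catalog instead.
    """
    return sorted(
        uid for uid, modified in docs.items() if modified >= reindex_started
    )


# Example: the full reindex started at noon; only later edits need a second pass.
started = datetime(2024, 1, 1, 12, 0)
docs = {
    "doc-a": datetime(2024, 1, 1, 9, 30),   # untouched since before the reindex
    "doc-b": datetime(2024, 1, 1, 14, 15),  # edited while the reindex was running
}
to_reindex = changed_since(docs, started)
```

The catch-up pass would have to repeat until the list is empty, or short enough to finish right before the switch.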
😖 sorry, way too many things on my plate as of late 🙃
On second thought, the two-collections solution does not fix the first problem, upgrading to a new version, because you cannot run two different Solr versions on the same server...
So, would allowing multiple Solr instances to be configured be the solution here? 🤔
We are probably approaching this the wrong way: Solr itself must have some tooling for that...
Solr is great, but it has a few downsides:
... and that hurts especially when reindexing the complete website takes a sizeable amount of time (for us, around 24 hours).
💡 One mitigation strategy we have been using is to make the changes on a non-production environment and, as soon as a critical amount of content has been reindexed, move the Solr data from non-production to production and finish the reindexing there.
Another strategy that I read about somewhere (probably in the Solr docs) is to configure a second, parallel collection, do the full reindex there (while the existing collection is still being used), and, once the reindexing has caught up, switch them over ✨
Would that be something that could be done within `collective.solr`? 🤔
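For the switch-over step, one option might be Solr's collection aliases. This is only a sketch, assuming Solr runs in SolrCloud mode and the site queries an alias rather than a collection directly. The `CREATEALIAS` action belongs to Solr's Collections API; the helper name, base URL, alias, and collection names below are made up for illustration, and nothing here is existing collective.solr behaviour:

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collection):
    """Build a Collections API URL that points `alias` at `collection`.

    Calling CREATEALIAS with an alias name that already exists replaces
    the alias, which is what makes the switch-over a single request.
    """
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": collection,
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Hypothetical names: the site queries "site", the fresh index lives in "site_v2".
switch_url = create_alias_url("http://localhost:8983/solr", "site", "site_v2")
```

An HTTP GET on `switch_url` would repoint the alias once the new collection has caught up. The old collection stays untouched, so rolling back would be the same call with the old collection name.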