Open gregschohn opened 1 year ago
I like the spirit of this proposal. I would love to abstract storage sufficiently that multiple versions of Lucene can coexist in the same OpenSearch version. But I also believe that read and write are both valuable paths that don't need to be delivered at the same time.
Assuming "everything" is hard, and if I had to prioritize, I would like to get https://github.com/opensearch-project/OpenSearch/issues/5451 as a first step. It would enable users to search old snapshots without having to reindex that data. Next, I would like the option of not reindexing existing data during a major upgrade and instead putting those indices in read-only mode. Finally, I would like everything that's proposed above.
I think the biggest challenge will be features and plugins (extensions). Their code changes along with the underlying index format, and maintaining backwards compatibility is additional work.
Is your feature request related to a problem? Please describe.
Each major version of Lucene changes behaviors in ways that can impact dependent clients and applications. Even something as minor as returning query results in a different order could be extremely difficult for some customers to accept properly and seamlessly, especially considering their current staffing and budgets for a project that may have been completed years earlier.
Customers can, of course, maintain their current indices by staying on old versions of ElasticSearch or OpenSearch. However, that limits their ability to adopt new features that improve their cluster's security, performance, or maintainability. When customers do upgrade their clusters and need to keep adding data to an index, they are stuck reindexing their existing indices and accepting all of the Lucene changes at the same time.
Describe the solution you'd like
Customers should be able to upgrade their cluster to a newer version of OpenSearch without needing to upgrade (reindex) their indices and accept new index behavior. That would allow customers to test-drive and/or adopt many new features of OpenSearch (outside of Lucene) with less upfront planning and risk. Clusters would still allow reindexing an old index into the latest format, but only when the customer is ready. The customer would manage that separately from the cluster upgrade, not as a prerequisite to it.
Such a solution should impose no performance or size cost on a customer who has no old indices. If there ARE older indices, the size cost should cover only what is necessary, and an old index should perform imperceptibly differently from how it did on the older version of the cluster. Plugins, with their own classloaders, seem like a great place to house alternate versions of Lucene and supporting libraries. Those isolated plugins should be able to read and write index contents just as older versions of OpenSearch and ElasticSearch did.
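To make the "no cost unless old indices exist" requirement concrete, here is a minimal Python sketch of a lazy registry. It is not OpenSearch code: `LegacyFormatRegistry`, its methods, and the string handlers are all hypothetical names for illustration, and a real implementation would presumably load an isolated plugin (with its own classloader) where this sketch calls a factory.

```python
from typing import Callable, Dict, List


class LegacyFormatRegistry:
    """Hypothetical: maps a Lucene major version to a lazily created handler."""

    def __init__(self, current_major: int) -> None:
        self.current_major = current_major
        self._factories: Dict[int, Callable[[], object]] = {}
        self._loaded: Dict[int, object] = {}

    def register(self, major: int, factory: Callable[[], object]) -> None:
        # Registering is cheap: the factory is not called until first use.
        self._factories[major] = factory

    def handler_for(self, index_major: int) -> object:
        # Current-format indices never touch any legacy code path.
        if index_major == self.current_major:
            return "native"
        # Load the legacy handler only when an old index is actually opened.
        if index_major not in self._loaded:
            if index_major not in self._factories:
                raise LookupError(f"no legacy handler for Lucene {index_major}")
            self._loaded[index_major] = self._factories[index_major]()
        return self._loaded[index_major]

    def loaded_versions(self) -> List[int]:
        return sorted(self._loaded)


# Assumed setup: a cluster on Lucene 9 that can serve Lucene 7 and 8 indices.
registry = LegacyFormatRegistry(current_major=9)
registry.register(8, lambda: "lucene8-handler")
registry.register(7, lambda: "lucene7-handler")
```

With this shape, a cluster that only opens current-format indices never instantiates a legacy handler, and a cluster with Lucene 8 indices pays only for the Lucene 8 handler.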
Ideally, all interactions with an old index should behave as they did on the older version of the cluster. A plugin that handles IO and indexing with legacy classes may be a good starting point, though it likely won't be entirely sufficient. A second phase, to increase inter-version response fidelity, will need to refactor version-specific Lucene code out of the aggregation layers. There is already an issue for that refactoring.
This is expected to be an evolving feature that can start modestly with one old format. Tests can be constructed to show where differences exist, and additional refactoring can take place to remedy those areas. The end result would be a clean separation between Lucene indices and the rest of OpenSearch, which should greatly improve the maintainability of OpenSearch. It could also make upgrading versions of Lucene more straightforward and intentional (see the commit to upgrade to Lucene 9).
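As a rough illustration of the kind of test that could show where differences exist, the Python sketch below compares the ordered document IDs returned for the same query by a legacy cluster and an upgraded one. `ResultDiff` and `diff_results` are hypothetical names, and the sketch assumes document IDs are unique within each result list. It separates missing or extra hits from pure reordering, since the issue calls out result order as a sensitive change.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ResultDiff:
    only_in_legacy: List[str]    # hits the upgraded cluster no longer returns
    only_in_upgraded: List[str]  # hits that appear only after the upgrade
    reordered: bool              # shared hits come back in a different order

    @property
    def identical(self) -> bool:
        return not (self.only_in_legacy or self.only_in_upgraded or self.reordered)


def diff_results(legacy: Sequence[str], upgraded: Sequence[str]) -> ResultDiff:
    """Compare two ranked lists of document IDs (assumed unique per list)."""
    legacy_set, upgraded_set = set(legacy), set(upgraded)
    # Restrict each list to the shared hits so that ordering is compared
    # independently of hits that were added or dropped.
    common_legacy = [d for d in legacy if d in upgraded_set]
    common_upgraded = [d for d in upgraded if d in legacy_set]
    return ResultDiff(
        only_in_legacy=[d for d in legacy if d not in upgraded_set],
        only_in_upgraded=[d for d in upgraded if d not in legacy_set],
        reordered=common_legacy != common_upgraded,
    )
```

For example, `diff_results(["a", "b", "c"], ["a", "c", "b"])` reports no missing or extra hits but sets `reordered` to `True`, which is exactly the class of difference a legacy-format plugin would be expected to eliminate.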
Describe alternatives you've considered
Reindexing along with an upgrade is always an option, but for some customers there will always be breaking and possibly sensitive changes. The proposal to support legacy indices won't spare all, or possibly even many, customers from eventually needing to upgrade/reindex their indices. Some breaking changes may even be desirable (fixing weighting schemes, adding new datatypes, etc.).
An entirely different and useful approach to the problem of version inconsistencies is to improve and codify what changes between releases. That work is being undertaken in this issue. Additional work may be needed to transform content affected by breaking changes in ways that preserve legacy functionality with maximum fidelity. That, too, cannot be perfect or immediate.
A final, long-term approach may be to reindex the data while preserving legacy behavior for an index. Even if an unwieldy number of conditionals could do that in the engine/core, internal Lucene details would be nearly impossible to emulate without over-complicating Lucene.
Allowing customers to make an index change on their own terms, with fewer moving parts, should help more sensitive customers. OpenSearch still needs to simplify the overall process of migrating data between versions, including the tools that give customers more assurances. This will likely start with an ecosystem of reliable tools that can eventually dovetail into a toolbox to guide a pain-free migration.
Additional context