Support writes with previous major Lucene versions

dreamer-89 commented 1 year ago

Description

Allow write capabilities with previous major BWC Lucene versions to support rolling upgrades[2] on OpenSearch.

Background

Customers using OpenSearch perform upgrades to move to the latest version of the software. One possible upgrade path is a rolling upgrade[2], where nodes are upgraded one at a time. This process results in an intermediate state where some nodes run the latest OpenSearch version while others still run the older version, creating a mixed-version cluster. This state does not work well for indices with segment replication[1] enabled, because the primary shard copies its segment files to the replica shard copies. That works when all nodes run the same version, but during an upgrade a replica shard copy may be running on an older-version node and so cannot read segment files written with the newer codec on the primary shard. This results in replica shard failures and impacts search availability.
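To make the failure mode concrete, here is a minimal Python sketch of the read-compatibility rule described above. It is a simplified model, not Lucene or OpenSearch code: the `can_read` helper and the `(major, minor)` tuples are assumptions for illustration, and the model assumes a node can read segments written by its own version or an older one, down to the previous major version.

```python
from typing import Tuple

Version = Tuple[int, int]  # (major, minor), e.g. (9, 4) for Lucene 9.4


def can_read(reader: Version, segment: Version) -> bool:
    """Simplified model: a node reads segments written by its own or an older
    version, as long as they are at most one major version behind."""
    not_newer = segment <= reader
    within_one_major = reader[0] - segment[0] <= 1
    return not_newer and within_one_major


# Mixed-version cluster in the middle of a rolling upgrade (hypothetical versions).
primary = (9, 9)            # already upgraded, writes segments with the newer codec
old_replica = (9, 4)        # still on the older version
upgraded_replica = (9, 9)

# The old replica cannot read the copied segments; the upgraded replica can.
print(can_read(old_replica, primary))
print(can_read(upgraded_replica, primary))
```

In this model the problem is purely directional: an upgraded reader paired with an old writer is fine, while an old reader paired with an upgraded writer fails.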

Solution attempted and issue

OpenSearch attempted to solve the mixed-cluster issue by having primary shard copies keep using the older codec until all replica copies are on the latest software. Segment files are then written with the older codec, which the replica shards can read. This works for upgrades with a minor Lucene version bump but not for a major one. We identified from [4] and manual testing that Lucene moves older codecs into backward-codecs and only allows reads for previous major versions. Thus there is no write compatibility with previous major Lucene versions, and the solution attempted in OpenSearch will not work for major Lucene version upgrades.
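The attempted approach can be sketched as picking the oldest version in the cluster as the write format. This is a hedged Python model, not the OpenSearch implementation: `pick_write_version` is a hypothetical helper, and the rule that only formats of the primary's own major version are writable is an assumption taken from the description above.

```python
from typing import Iterable, Optional, Tuple

Version = Tuple[int, int]  # (major, minor)


def pick_write_version(primary: Version, nodes: Iterable[Version]) -> Optional[Version]:
    """Pick the oldest version in the cluster as the write format, if the
    primary can still write it. Assumption of this model: write support exists
    only for formats of the primary's own major version."""
    oldest = min(nodes)
    target = min(oldest, primary)
    if target[0] != primary[0]:
        return None  # previous-major codecs are read-only on the upgraded primary
    return target


# Minor bump: the primary can fall back to the older format.
print(pick_write_version((9, 9), [(9, 4), (9, 9)]))
# Major bump: no writable format that the old replicas understand.
print(pick_write_version((10, 0), [(9, 9), (10, 0)]))
```

A `None` result corresponds to the case this issue is about: the upgraded primary has no format it can write that a replica on the previous major version can read.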

References

[1] Segment replication in OpenSearch
[2] Rolling upgrades in OpenSearch
[3] OpenSearch engine issue
[4] Backward codecs in Lucene

jpountz commented 1 year ago

For reference there was a similar discussion at #10274.

msokolov commented 1 year ago

I'm confused as to why the solution described won't work with segment replication. As I understand it what that solution describes would be some writer nodes writing segments with old version while replicas' (readers) software gets upgraded. Since replicas only read they can continue to read these old (version N-1) index segments. Even if you wanted to allow them to do merging (why, I don't know) you could and they would write new-format segments, which is fine as long as they never share them with other replicas. Then once all replicas are on version N, you can upgrade the writer nodes to version N and start publishing version N segments. Where is the hitch - what am I missing?

dreamer-89 commented 1 year ago

Thank you @msokolov for the comment, and apologies for not making the description clear enough. You are right, there should not be any issue when the writer is on a non-upgraded node while the readers are on upgraded nodes.

The issue is the other way round: readers are on non-upgraded nodes while writers are on upgraded nodes during a major Lucene version upgrade. It does work for minor Lucene version upgrades, though.

itiyama commented 6 months ago

@msokolov @mikemccand What is your recommendation on this for minor version upgrades? Lucene does not support writes in the old codec format for minor versions either, but I can potentially override the codec to use the old writer in my application. Since Lucene has not tested IndexWriter with the old codec format, this can potentially result in unknown bugs.

The challenge for OpenSearch is that it stores the data in a remote store, so upgrades are not seamless unless we build forward compatibility for old codec versions into OpenSearch. Even minor version upgrades for Lucene therefore become tricky.

msokolov commented 6 months ago

I thought "OpenSearch attempted to solve mixed cluster issue by updating primary shard copies to keep using older codec until all replica copies are on latest software" was solving this issue? How does a remote store alter the picture? I guess you need a remote store for the old version and a separate remote store for the new version?

itiyama commented 6 months ago

OpenSearch never went ahead with the proposed solution since it does not work for major versions. I am wondering whether we should even rely on it for minor versions, and need your help on that. OpenSearch downgrades are not seamless due to codec version compatibility issues during deployments.

Current deployment process

The transition from OpenSearch 2.x (Lucene 9.4) to 2.x+1 (Lucene 9.9) involves moving all replicas, followed by the primaries. Downgrading primaries is unsupported, as there is no version of the software that understands both the old and new codec formats.

Proposed 2 phase deployment process

Phase 1: Transition to OpenSearch 2.x+1 with Lucene 9.9, setting the default write version to 9_4.
Phase 2: Enable codec version 9_9 for OpenSearch 2.x+1.

In this phased deployment, there is a version of the software at every stage which you can roll back to.
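The rollback argument can be checked with a small Python model. This is a sketch under stated assumptions, not OpenSearch code: `Stage`, `rollback_safe`, and `all_rollbacks_safe` are hypothetical names, and the only rule modeled is that software cannot read a codec format newer than itself.

```python
from dataclasses import dataclass
from typing import List, Tuple

Version = Tuple[int, int]  # (major, minor)


@dataclass(frozen=True)
class Stage:
    software: Version     # Lucene version the software ships with
    write_codec: Version  # codec version used for newly written segments


def rollback_safe(current: Stage, previous: Stage) -> bool:
    """Rolling back is safe if the previous software can read what the
    current stage wrote (simplified: it cannot read newer formats)."""
    return current.write_codec <= previous.software


def all_rollbacks_safe(stages: List[Stage]) -> bool:
    """Check every step of a deployment plan against its predecessor."""
    return all(rollback_safe(cur, prev) for prev, cur in zip(stages, stages[1:]))


single_phase = [Stage((9, 4), (9, 4)), Stage((9, 9), (9, 9))]
two_phase = [Stage((9, 4), (9, 4)), Stage((9, 9), (9, 4)), Stage((9, 9), (9, 9))]

print(all_rollbacks_safe(single_phase))  # direct upgrade: rollback cannot read 9_9 segments
print(all_rollbacks_safe(two_phase))     # phased upgrade: each step has a readable fallback
```

The intermediate stage (new software, old write codec) is what makes every step reversible, and it is exactly the stage that depends on writing with an older codec.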

Challenges with supporting intermediate stage

Old codecs are not supported out of the box for writes, even for minor version upgrades (see https://github.com/apache/lucene/blob/branch_9_10/lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene90/Lucene90HnswVectorsFormat.java#L115).
Since the writer logic is available in backward codecs for testing, I can still go ahead and override the codec write methods to work with old writers. But this path is not well tested in Lucene; for example, IndexWriter may not work with an old codec version for writes, even for minor version upgrades. So I am not comfortable relying on this mechanism unless I am aware of the compatibility risks.

Questions

1. Does Lucene officially support only the latest codec version for write operations?
2. Can I assess the compatibility risks associated with older codec versions on new Lucene software by running the entire test suite with older codec versions? Is this method sufficient for identifying potential issues?
3. For applications relying on Lucene's segment replication model, and lacking a stable software version for rollback, how can they address deployment risks without independently verifying compatibility? Alternatively, how can they manage deployments without a stable fallback option, potentially risking downtime during rollbacks? Should Lucene consider supporting this officially?

Remote store

It is a special case of segment replication, so the same problems exist.

msokolov commented 6 months ago

Since the writer logic is available in backward codecs for testing, I can still go ahead and override the codec write methods to work with old writers. But the path is not tested well in Lucene e.g. IndexWriter may not work with old codec version for writes, even for minor version upgrades.

Although it is possible that IndexWriter would somehow stop being able to write an older version of a codec, that seems unlikely for a minor release. It is true that nothing enforces that that works. However running unit tests with your backwards-supporting codec should be enough to have confidence that it works.

I think the answers to your yes/no questions are 1. Yes, 2. Yes. For 3, I'm not sure. It does seem like a difficult situation. I don't see how Lucene would support writing two index versions at the same time though. I think it sometimes happens that the backwards-codec implementations even drop support for writing, so it might not be a reliable solution to (2) in the general case.

itiyama commented 6 months ago

Lucene does support reading N-1 index versions at the same time, and it should theoretically be feasible to enable writing for N-1 versions, though not concurrently. The software boots up with one index version, maintaining it throughout its runtime. This means that the software doesn't need to handle cases where the writer version is switched in memory but can perform all necessary checks at boot time. While this approach may involve high maintenance overhead for Lucene, I want to emphasize and understand the feasibility aspect better.
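The boot-time idea can be sketched in Python as follows. This is a hypothetical model of the proposal, not an existing Lucene API: `WriterConfig` and `boot` are illustrative names, and the allowed range (the current major version or the one before it) is an assumption taken from the paragraph above.

```python
from dataclasses import dataclass
from typing import Tuple

Version = Tuple[int, int]  # (major, minor)


@dataclass(frozen=True)  # frozen: the write version cannot change after boot
class WriterConfig:
    software: Version
    write_version: Version


def boot(software: Version, requested_write: Version) -> WriterConfig:
    """Validate the requested write version once, at startup."""
    if requested_write > software:
        raise ValueError("cannot write a format newer than the software")
    if software[0] - requested_write[0] > 1:
        raise ValueError("write format is more than one major version behind")
    return WriterConfig(software, requested_write)


# Hypothetical: software on major version 10 boots while still writing a 9.9 format.
config = boot((10, 0), (9, 9))
print(config.write_version)
```

Because all checks run in `boot` and the resulting config is immutable, nothing in the running process has to handle a writer version that changes in memory.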
