Closed conggguan closed 4 months ago
Intro
Now implement two-phase feature by adding rescoreContext in the begin of HybridQueryPhaseSearcher's. It is harmless but little hack, so there is a long term plan to replace this implementation.
RFC in opensearch core
What we need to implement this feature is the whole query and searchContext, so if OpenSearch Core can open a API which can help me get it before the query performed, we can move the rescoreContext function to a more general interface of core.
RoadMap of this optimizing
- RFC in OpenSearch Core.
- A Incremental update in NeuralSearch plugin.
Please create a RFC on the opensearch core side and provide details why we cannot open up an interface from core to implement this functionality?
I was not able to complete the full review of the PR as the PR contains more than 1 feature. The description of the PR says it has these 2 feature:
Enhance the speed of neuralsparse query by two-phase. Now support top-level, boolean and boost compound query; for other compound query will degrade into origin logic and speed.
But in the PR we have code add the feature to use tokens in the NeuralSparseQuery clause.
I was not able to complete the full review of the PR as the PR contains more than 1 feature. The description of the PR says it has these 2 feature:
Enhance the speed of neuralsparse query by two-phase. Now support top-level, boolean and boost compound query; for other compound query will degrade into origin logic and speed.
But in the PR we have code add the feature to use tokens in the NeuralSparseQuery clause.
Yes we merge #693 manually in advance because the query by tokens help us to write IT cases more flexible. Will update this PR after #693 get merged (will be merged soon)
I was not able to complete the full review of the PR as the PR contains more than 1 feature. The description of the PR says it has these 2 feature:
Enhance the speed of neuralsparse query by two-phase. Now support top-level, boolean and boost compound query; for other compound query will degrade into origin logic and speed.
But in the PR we have code add the feature to use tokens in the NeuralSparseQuery clause.
Updated now.
@vibrantvarun Some BWC test failed because of there is a version check of stream.
// stream in
if (NeuralSparseTwoPhaseParameters.isClusterOnOrAfterMinReqVersionForTwoPhaseSearchSupport()) {
this.neuralSparseTwoPhaseParameters = in.readOptionalWriteable(NeuralSparseTwoPhaseParameters::new);
}
// stream out
if (NeuralSparseTwoPhaseParameters.isClusterOnOrAfterMinReqVersionForTwoPhaseSearchSupport()) {
out.writeOptionalWriteable(this.neuralSparseTwoPhaseParameters);
}
But at now the PR hasn't been backport to V_2_14_0, so after merge will success. Or we can set Version.CURRENT to MinReqVersion and make a further PR to change it to 2.14 again. If a extra pr required or just keep is okay?
Lets ensure that all BWC tests are successful before this change can be merged.
Some BWC test failed because of there is a version check of stream.
// stream in if (NeuralSparseTwoPhaseParameters.isClusterOnOrAfterMinReqVersionForTwoPhaseSearchSupport()) { this.neuralSparseTwoPhaseParameters = in.readOptionalWriteable(NeuralSparseTwoPhaseParameters::new); } // stream out if (NeuralSparseTwoPhaseParameters.isClusterOnOrAfterMinReqVersionForTwoPhaseSearchSupport()) { out.writeOptionalWriteable(this.neuralSparseTwoPhaseParameters); } But at now the PR hasn't been backport to V_2_14_0, so the 2.14 node can not deserialize the stream correctly. It can only work after backport success. One workaround is using Version.Current, but we need to modify it to 2.14 after backport finished.
@conggguan please add BWC test for the change. It gives more confidence. Also I see you have significant changes in constructors that too version specific. BWC test is a must for these type of changes.
cc: @navneet1v
Description
This change implement for #646
Feature support query
Feature implementation optimize plan
Intro
Now implement two-phase feature by adding rescoreContext in the begin of HybridQueryPhaseSearcher's. It is harmless but little hack, so there is a long term plan to replace this implementation.
RFC in opensearch core
To add a correct rescoreContext to searchContext, we need both searchContext and whole compound query, but now if we can only get these together in the HybridQueryPhaseSearcher. In QueryBuilder, the searchContext is private, and whole query are not visible. What we need to implement this feature is the whole query and searchContext, so if OpenSearch Core can open a API which can help me get it before the query performed, we can move the rescoreContext function to a more general interface of core.
RoadMap of this optimizing
Issues Resolved
Resolve #646
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.