Open navneet1v opened 1 year ago
@navneet1v I don't think this is a bug instead it is how the QueryPhaseSearcher
is exposed to the plugin and uses either the core implementation of QueryPhaseSearcher
or the one provided by plugin. It is upto a particular installation with set of plugins to see which one to use. If the use case here is to have multiple QueryPhaseSearcher
support (i.e. core + ones provided by plugins), then that functionality needs to be built based on such use case. For example: With concurrent search, since it is part of core we have created this QueryPhaseSearcherWrapper which will hold on to both DefaultQueryPhaseSearcher
and ConcurrentQueryPhaseSearcher
and use one or the other based on which path to take for a request.
A plugin can extend this or compose it based on the use case. Like plugin can provide a QueryPhaseSearcher, which can keep map of core provided searchers and its own and then based on request state decide which one to use. Or even build this in the core to make that decision. Not sure why it is tied to concurrent search GA. Based on https://github.com/opensearch-project/OpenSearch/issues/7020 my understanding was you or @martin-gaievski were doing some PoC for the same ? Based on the code shared here, seems like it needs to extend QueryPhaseSearcherWrapper
now instead of DefaultQueryPhaseSearcher
?
@sohami
A plugin can extend this or compose it based on the use case The problem here is the QueryPhaseSearcherWrapper is marked as
opesearch.internal
so it is subject to change. If it is marked asopesearch.api
then it would make more sense to extend QueryPhaseSearcherWrapper.Based on the code shared here, seems like it needs to extend QueryPhaseSearcherWrapper now instead of DefaultQueryPhaseSearcher ?
Yes we can change the neural search class to extend the QueryPhaseSearcherWrapper, but still it doesn't solve the problem. What will happen if some other backend plugin comes up and want to implement another query phase searcher. Thing is it cannot happen. Hence I think we should start thinking about this and not come up with these intermediate solutions. Reason why I am tieing this up with Concurrent Search is because we are adding it as a new Feature in OpenSearch Core and it changes the QueryPhaseSearcher.
Also, my main point here is the interface provided for extending QueryPhaseSearcher is not an actual extension.
It is upto a particular installation with set of plugins to see which one to use. Yes that is correct. But from OpenSearch project standpoint we provide full distribution and min distribution. So we need to make sure that in both those installation we are doing it correctly when more than 1 plugin is trying to override the QueryPhaseSearcher.
This is similar to the Codec extension point we had earlier in the OpenSearch. We extended that interface to make sure that more than one codec can be be present in OpenSearch. I was thinking we should come up with a similar solution here too. See https://github.com/opensearch-project/OpenSearch/pull/1387 and https://github.com/opensearch-project/OpenSearch/pull/1404
cc: @nknize
Yes we can change the neural search class to extend the QueryPhaseSearcherWrapper, but still it doesn't solve the problem. What will happen if some other backend plugin comes up and want to implement another query phase searcher. Thing is it cannot happen. Hence I think we should start thinking about this and not come up with these intermediate solutions
@navneet1v I didn't say in my last comment that extending the OpenSearch internal implementation is the best way to go. The way it was done currently is as I explained above where it expects depending on the installed plugin to either use plugin provided implementation or use the core ones. Given there are use cases to have multiple composite searchers we should definitely extend it and you can create PR with the change in core which solves the plugin use case in best way.
Reason why I am tieing this up with Concurrent Search is because we are adding it as a new Feature in OpenSearch Core and it changes the QueryPhaseSearcher
Concurrent search is changing the QueryPhaseSearcher
used internally in the core. It doesn't affect how the plugins are implementing the QueryPhaseSearcher
or the current extension points.
Also, my main point here is the interface provided for extending QueryPhaseSearcher is not an actual extension.
Agreed and I would think when QueryPhaseSearcher
interface was added in plugin this use case was not there (@reta
can correct me here) and hence it was provided in its current form. But again you can expand on the current implementation to provide a better one to help with the new use case and probably a generic one to help other plugins as well (if we know about those use cases).
Agreed and I would think when QueryPhaseSearcher interface was added in plugin this use case was not there (@reta can correct me here) and hence it was provided in its current form.
@sohami correct, we'd been looking to the implement one specific case, the concurrent search, in a way that does allow injection of different implementations by plugins.
@navneet1v what @sohami is saying is certainly one of the options to implement the desired flow: provide your own QueryPhaseSearcher but delegate to the ConcurrentQueryPhaseSearcher / DefaultQueryPhaseSearcher when needed (it is possible now without any changes in core). Also, the QueryPhaseSearcher
is not about just search anymore, it also includes AggregationProcessor
.
However, what we probably need is to have the contextual QueryPhaseSearcher that could complement (or may be even replace) the way QueryPhaseSearcher
is provided currently, something roughly like that:
interface ContextualQueryPhaseSearcherProvider {
Optional<QueryPhaseSearcher> forContext(SearchContext context);
}
That would allow plugins to register many providers but only one is going to be picked for the context. Obviously, the issue is that since QueryPhaseSearcher
s are not composable, plugins could provide competing implementations that somehow have to be resolved deterministically (same one always wins, others loose).
Describe the bug As part of Enabling the Normalization and Score Combination Feature(RFC: https://github.com/opensearch-project/neural-search/issues/126) in OpenSearch via Neural Search plugin, we used the Extension point of QueryPhaseSearcher(https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/search/SearchModule.java#L1248-L1259), but the problem with that extension point is it enables the Searcher at Cluster Level and not at the Index level. This make the Concurrent Search un-usable because another implementation of QueryPhaseSearcher is there.
So cutting this issue to start a conversation that, the fix needs to be done before Concurrent Segment Search can become GA.
To Reproduce Steps to reproduce the behavior:
Expected behavior Ideally there should be a way in which Plugins can define when to use their QueryPhaseSearcher at Index level. Also, core needs to make sure that all the QueryPhaseSearcher which are registered are initialized and during runtime based on the index/cluster setting right Searcher is picked up.
Plugins k-NN, MLCommons, Neural Search.
Screenshots NA
Host/Environment (please complete the following information): NA
Additional context The same issue was raised as a feature request too: https://github.com/opensearch-project/OpenSearch/issues/7020