10up / ElasticPress

A fast and flexible search and query engine for WordPress.
https://elasticpress.io
GNU General Public License v2.0

Allow for the usage of the msearch API #2934

Open nickchomey opened 2 years ago

nickchomey commented 2 years ago

Is your enhancement related to a problem? Please describe.

I would like to be able to search multiple indexes with one search query, but right now ElasticPress appears to only override one query at a time.

Describe the solution you'd like

Elasticsearch has the Multi Search (msearch) API, which allows you to send many separate queries in one request and receive all of the results at once. https://www.elastic.co/guide/en/elasticsearch/reference/current/search-multi-search.html

Describe alternatives you've considered

I've created my own msearch functionality by copying and modifying the contents of query() in wp-content/plugins/elasticpress/includes/classes/Elasticsearch.php. But I've consequently had to bypass the entire ElasticPress mechanism using the ep_skip_query_integration filter, which precludes the use of the majority of EP, its functions and filters - I'm essentially only using the Feature and Indexable APIs to create and sync custom indexes.

So, it would be better if msearch was integrated into EP, perhaps with a filter in the query() function that can toggle between _search and _msearch?

Additional context

This is with regard to #3 (BuddyPress Integration), which requires the creation of many indexes for the various custom BP tables. It is obviously ideal to be able to do a full network-wide search for each search term.

felipeelia commented 2 years ago

Hey @nickchomey,

Unfortunately, I'm unclear what you are trying to achieve with the msearch API. Do you mind explaining it a bit further?

If you are trying to send the same query to different indices, you could use the ep_query_request_path filter, and add the indices to the request path. Another way to achieve that is by sending an array to the query_es method, like we do while processing the sites parameter here.

If you are trying to send different queries in the same request (the objective of the msearch API), it can get tricky really fast. One reason is URL-based access control: if you are querying different indices (setting the different index names in the query), the request could be blocked by a security measure.

If that is not a problem, I think you could get the queries using Post::format_args(), join them as the msearch API expects and send them using the query() method, changing from _search to _msearch using the ep_query_request_path filter. Did you try that already? Thanks!
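To illustrate what "join them as the msearch API expects" means, here is a minimal sketch in Python (not EP code). The msearch request body is newline-delimited JSON: one header line naming the index, then one line with the query, repeated per search, with a trailing newline. The index names and queries below are made up for the example; in EP the queries would come from something like Post::format_args().

```python
import json


def build_msearch_body(searches):
    """Serialize (index, query) pairs into an _msearch request body.

    Each search contributes two lines: a header naming the index,
    followed by the query itself. The body must end with a newline.
    """
    lines = []
    for index, query in searches:
        lines.append(json.dumps({"index": index}))  # header line
        lines.append(json.dumps(query))             # query line
    return "\n".join(lines) + "\n"


# Hypothetical index names and queries, for illustration only.
searches = [
    ("bp-members", {"query": {"match": {"name": "nick"}}}),
    ("bp-groups", {"query": {"match": {"title": "nick"}}}),
]

body = build_msearch_body(searches)
print(body)
```

The resulting string would be sent to the _msearch endpoint with the Content-Type set to application/x-ndjson.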

nickchomey commented 2 years ago

@felipeelia

Thanks for the response!

I'm simply trying to use the msearch API to do multiple searches with one request. The BuddyBoss Network Search mechanism does a single mega MySQL query for all the various searchables (members, groups, activity posts, CPTs etc...) and then processes them all at once in order to display the results on one tabbed page - you can see how that works in the screenshot I shared in #3.

So, the mechanism I use could either loop through the searchables, do a normal query() for each one, and then aggregate all the results into one array, or it could just use msearch.

You can use the various filters to modify the path, query etc... to run an msearch query, but the problem is with how query() handles the response - it isn't equipped for parsing an msearch response.
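To make the difference concrete, here is a rough sketch in Python (not EP code) of what parsing an msearch response involves. A normal _search response has a single top-level hits object, whereas an msearch response wraps one result per query in a responses array, in the same order the queries were sent, and any individual entry can be an error. The function name, labels and sample data are hypothetical.

```python
def split_msearch_response(response, labels):
    """Map each entry of an msearch 'responses' array to its label.

    Entries come back in request order, so labels are paired by position.
    A failed search carries an 'error' key instead of 'hits'.
    """
    results = {}
    for label, item in zip(labels, response.get("responses", [])):
        if "error" in item:
            results[label] = {"error": item["error"], "hits": []}
        else:
            results[label] = {"error": None, "hits": item["hits"]["hits"]}
    return results


# Trimmed-down sample response, for illustration only.
sample = {
    "responses": [
        {"status": 200, "hits": {"hits": [{"_id": "1", "_source": {"name": "nick"}}]}},
        {"status": 404, "error": {"type": "index_not_found_exception"}},
    ]
}

results = split_msearch_response(sample, ["members", "groups"])
print(results)
```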

So, as described, I have achieved what I'm looking for with the msearch mechanism through a modified version of the query() function - without any of the problems that you've mentioned (the msearch API exists for a reason, after all...). Moreover, EP itself uses the Bulk API, which is listed in that document as a concern, so you've evidently figured out how to avoid potential issues.

I see a few options:

  1. Create a separate m_query() function
  2. Add some sort of logic to the query() function to parse search vs msearch based on a parameter that is passed in
  3. Add some sort of filter prior to parsing the response, maybe at line 391, which would allow an external function to retrieve the response, do what it needs to do, and then skip the parsing if a particular value is returned, signifying that the filter was used successfully.
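As a rough illustration of option 3, here is a language-agnostic sketch written in Python (it is not how EP's query() is actually structured, and every name in it is hypothetical). A pre-parse hook gets the raw response first; if it returns a value, the default single-search parsing is skipped, otherwise parsing proceeds as usual.

```python
def parse_search(response):
    """Default parsing for a normal _search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


def handle_response(response, pre_parse_filters=()):
    """Let external filters claim the response before default parsing runs."""
    for pre_parse in pre_parse_filters:
        result = pre_parse(response)
        if result is not None:  # a filter handled it, so skip default parsing
            return result
    return parse_search(response)


def msearch_filter(response):
    """Example external filter: only handles msearch-shaped responses."""
    if "responses" not in response:
        return None  # not an msearch response, fall through to the default
    return [parse_search(item) for item in response["responses"]]


single = {"hits": {"hits": [{"_source": {"id": 1}}]}}
multi = {"responses": [single, {"hits": {"hits": []}}]}

print(handle_response(single, [msearch_filter]))  # default parsing
print(handle_response(multi, [msearch_filter]))   # handled by the filter
```

Options 1 and 2 would instead move the msearch_filter logic into EP itself, either as a separate function or as a branch selected by a parameter.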

It seems to me that 1 or 2 is the right way to do this - officially support msearch. After all, with there now being many official Indexables and it being simple to create custom ones, it stands to reason that many people would like to do a single search across many indices. When I first started with all of this, I was very surprised that this wasn't already possible.

Is this helpful?

P.S. In the end, I do think it is proper for me to be using the ep_skip_query_integration filter for my purposes - I've been able to work around that to make use of various EP mechanisms (e.g. for highlighting) by figuring out how to use the factory etc...

So, the only thing that should be under consideration here is whether msearch should be supported by EP - I strongly feel that it should.