ohltyler opened 4 months ago
We can further scope down the desired data output from this change. Listing them out below in order of priority:
Item 1: Will allow us to enable "Preview" on the UI when chaining search request processors.

Item 2: Currently, on the UI, in order to view the "interim" outputs given some search response processor, we build out a temporary pipeline up to & including the selected processor, execute it, and display the result. Item 2 would greatly simplify this: we could remove all of that custom logic, execute the entire pipeline, and just parse out the selected processor's output. It would also make it easy to extend the UI to let users click on any processor and view its output, and/or view all interim outputs at once (see the sketch after this list).

Items 3/4: Will open up further UX opportunities to provide fine-grained details and debugging outputs for users building complex pipelines (e.g., debugging which processor is causing issues or taking a long time to complete, such as a lagging LLM response).
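To make the Item 2 simplification concrete, below is a minimal client-side sketch (Python, using `requests`) contrasting the current temporary-pipeline approach with what a single verbose execution could replace it with. The pipeline CRUD and search endpoints are real, but the `verbose` parameter and the `processor_results` response field are assumptions:

```python
import requests  # client-side sketch; see assumptions noted in comments

OPENSEARCH = "http://localhost:9200"

def preview_processor_today(index, response_processors, selected_idx, query):
    """Current UI approach: create a temporary pipeline containing only the
    processors up to & including the selected one, then execute it."""
    temp_pipeline = {"response_processors": response_processors[: selected_idx + 1]}
    requests.put(f"{OPENSEARCH}/_search/pipeline/temp_preview", json=temp_pipeline)
    resp = requests.get(
        f"{OPENSEARCH}/{index}/_search",
        params={"search_pipeline": "temp_preview"},
        json=query,
    )
    return resp.json()  # the truncated pipeline's final output

def preview_processor_verbose(index, pipeline_name, selected_idx, query):
    """Proposed approach: run the full pipeline once with a hypothetical
    `verbose` flag and parse out the selected processor's interim output."""
    resp = requests.get(
        f"{OPENSEARCH}/{index}/_search",
        params={"search_pipeline": pipeline_name, "verbose": "true"},  # assumed flag
        json=query,
    )
    # `processor_results` is an assumed field, modeled on the ingest
    # pipeline's _simulate?verbose=true output.
    return resp.json()["processor_results"][selected_idx]
```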
For search requests, we do support profiling (profile=true); I think including the data for each search processor (new profiler sections) would be a natural way to expose additional stats?
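For reference, `profile` is an existing search-body flag, while the per-processor section sketched in the comments below is hypothetical:

```python
import requests  # sketch; the "search_pipeline" profile section does not exist today

resp = requests.get(
    "http://localhost:9200/my-index/_search",
    params={"search_pipeline": "my_pipeline"},
    json={"profile": True, "query": {"match_all": {}}},
)
profile = resp.json()["profile"]
# Today `profile` holds per-shard query/aggregation timings. A new, assumed
# section for search pipeline processors might look something like:
# profile["search_pipeline"] = [
#     {"processor": "filter_query", "time_in_nanos": 12345},
#     {"processor": "ml_inference", "time_in_nanos": 9876543},
# ]
```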
Interim outputs of each search request and search response processor
This could include a very large number of hits, often paginated or limited to the "top" results. If we're using this for debugging purposes, we probably don't need all the hits; a representative sample should do, right?
I agree with @dbwiddis on this point. Since our primary goal is to understand how each processor transforms the data, we can limit the size of the search response to just one or two hits. This approach is similar to the ingest pipeline's _simulate API, which demonstrates how data will be transformed upon ingestion.
We don't have a comparable API for search pipelines because we can't simulate the response; we must make an actual search request to get even a sample search response. Given that the requirement for this issue is the verbose flag, it's prudent to limit the response size to avoid dealing with a large number of hits.
The key reasons for this approach are:
- It provides sufficient information to demonstrate the transformation process.
- It reduces the computational load and response size.
- It aligns with the primary goal of illustrating the pipeline's effect on the data.
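For comparison, the ingest-side precedent mentioned above is the existing `_simulate` API with `verbose=true`, which returns one `processor_results` entry per processor (pipeline and field names below are made up):

```python
import requests  # exercises the existing ingest _simulate API

resp = requests.post(
    "http://localhost:9200/_ingest/pipeline/my_pipeline/_simulate",
    params={"verbose": "true"},
    json={"docs": [{"_index": "my-index", "_source": {"text": "hello world"}}]},
)
# With verbose=true, each doc's result lists every processor's interim output;
# this is the model the issue proposes mirroring on the search side.
for step in resp.json()["docs"][0]["processor_results"]:
    print(step["processor_type"], step.get("doc", {}).get("_source"))
```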
Is your feature request related to a problem? Please describe
As the library of search pipeline processors continues to grow and become more complex, it can become increasingly difficult to know how data is passed around and transformed through the processors. An example is the introduction of ML inference processors, which can have logic to transform arbitrary and complex model inputs/outputs.
Describe the solution you'd like
Having a way to debug and view the state of search processors (on both the request side and the response side) would be helpful in discovering issues related to data transformation or any other intermediate failure. It would also be generally useful for viewing the end-to-end pipeline execution, making it easy to see how the request is transformed and executed, and how any response data is transformed. Additionally, this could be consumed and viewed on the flow framework frontend plugin, which is initially focused on the configuration and testing of ingest and search pipelines as users build out their complex use cases.
A few different implementation ideas:

1. Add a verbose parameter to a search request that contains a search pipeline, or a standalone API, for returning the end-to-end breakdown of each processor's output. Note this is already done today in ingest pipelines (see the verbose param on the _simulate API).
2. Add a return_request parameter to a search request that contains a search pipeline, and return the finalized/transformed search request that was used to execute against the index. While this wouldn't offer processor-level granularity, it could be a simple way to get some intermediate information.

Intuitively, I think Option 1 provides the most flexibility and the simplest/most straightforward design. It is also consistent with how ingest pipelines support this idea.
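To make Option 1 concrete, here is a hypothetical request/response sketch; the parameter name and the `processor_results` structure are assumptions, modeled on the ingest `_simulate` verbose output:

```python
import requests  # hypothetical API sketch for Option 1; nothing here exists yet

resp = requests.get(
    "http://localhost:9200/my-index/_search",
    params={"search_pipeline": "my_pipeline", "verbose": "true"},  # proposed param
    json={"query": {"match": {"text": "semantic search"}}},
)
body = resp.json()
# Assumed additional section, with one entry per request/response processor:
# body["processor_results"] = [
#     {"processor": "ml_inference", "type": "search_request",
#      "transformed_request": {...}, "time_in_nanos": 123456},
#     {"processor": "rerank", "type": "search_response",
#      "hits_sample": [...], "time_in_nanos": 654321},
# ]
```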
Related component
Other
Describe alternatives you've considered
No response
Additional context
No response