[FEATURE] Hybrid Search should provide scores of sub queries for understanding/debugging the results.

opensearch-project / neural-search

Plugin that adds dense neural retrieval into the OpenSearch ecosystem

Apache License 2.0

[FEATURE] Hybrid Search should provide scores of sub queries for understanding/debugging the results. #658

Open vamshin opened 7 months ago

vamshin commented 7 months ago

Is your feature request related to a problem?

Hybrid search doesn't return the score of each individual sub-query, which makes it difficult to debug why fragments were included or excluded.

What solution would you like?

As part of the _explain API, we should provide the scores of the sub-queries for understanding and debugging the results.

smacrakis commented 2 months ago

Yes, customers would like to see the individual sub-query scores from hybrid search, both for debugging and for training LTR models.

smacrakis commented 2 months ago

We were also hoping for this feature in 2.16 for our own work (with OSC) in tuning hybrid search using LTR.

yuye-aws commented 2 months ago

Are we also going to support the explain API for k-NN queries, as in https://github.com/opensearch-project/k-NN/issues/875?

smacrakis commented 2 months ago

The documentation on _explain says "The explain API is an expensive operation in terms of both resources and time. On production clusters, we recommend using it sparingly for the purpose of troubleshooting." If this is true, then returning the subquery scores via _explain is not going to be viable for LTR in production if the subquery scores are being used as features. Do we have a path to returning the scores more efficiently?

zhichao-aws commented 2 months ago

> The documentation on _explain says "The explain API is an expensive operation in terms of both resources and time. On production clusters, we recommend using it sparingly for the purpose of troubleshooting." If this is true, then returning the subquery scores via _explain is not going to be viable for LTR in production if the subquery scores are being used as features. Do we have a path to returning the scores more efficiently?

I have the same question. Customers may need the absolute scores as input features for downstream systems. However, the current hybrid query only normalizes the scores, so that information is lost.
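
To illustrate why normalization discards the absolute scores, here is a minimal Python sketch. It assumes min-max normalization followed by an arithmetic-mean combination as a stand-in for whatever the search pipeline is configured to do; the function names and the score values are hypothetical, and details such as documents missing from one sub-query are not modeled.

```python
from typing import List


def min_max_normalize(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1]. A constant list maps to all 1.0 (an assumption)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine_arithmetic_mean(per_query: List[List[float]]) -> List[float]:
    """Average each document's normalized scores across the sub-queries."""
    return [sum(doc_scores) / len(doc_scores) for doc_scores in zip(*per_query)]


# Hypothetical raw scores for the same three documents from two sub-queries.
lexical_a = [12.0, 8.0, 4.0]
vector_a = [0.75, 0.5, 0.25]

# A second request with much lower absolute scores but the same relative spacing.
lexical_b = [3.0, 2.0, 1.0]
vector_b = [0.375, 0.25, 0.125]

hybrid_a = combine_arithmetic_mean(
    [min_max_normalize(lexical_a), min_max_normalize(vector_a)]
)
hybrid_b = combine_arithmetic_mean(
    [min_max_normalize(lexical_b), min_max_normalize(vector_b)]
)

# Both requests yield identical hybrid scores, so the raw magnitudes
# cannot be recovered from the final result.
print(hybrid_a)
print(hybrid_b)
```

Both requests produce the same combined scores even though the raw sub-query scores differ by a factor of four and two respectively, which is why a downstream system such as an LTR model would need the sub-query scores returned separately.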