Open vamshin opened 7 months ago
Yes, customers would like to see both scores from hybrid search, both for debugging and for training LTR models.
We were also hoping for this feature in 2.16 for our own work (with OSC) in tuning hybrid search using LTR.
Are we also going to support the explain API for k-NN queries, as requested in https://github.com/opensearch-project/k-NN/issues/875?
The documentation on _explain says "The explain API is an expensive operation in terms of both resources and time. On production clusters, we recommend using it sparingly for the purpose of troubleshooting." If this is true, then returning the subquery scores via _explain is not going to be viable for LTR in production if the subquery scores are being used as features. Do we have a path to returning the scores more efficiently?
I have the same question. Customers may need the absolute scores as input features for downstream systems, but the current hybrid query only returns normalized scores, so that information is lost.
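To illustrate the point about lost information, here is a minimal sketch, assuming min-max normalization followed by an arithmetic mean of the two subquery scores (the technique configured in a real pipeline may differ). The function names and score values are hypothetical and chosen only for illustration. Two result sets whose raw scores differ in magnitude end up with identical combined scores, so the absolute values cannot be recovered downstream.

```python
def min_max(scores):
    """Rescale a list of scores to [0, 1] (assumed normalization technique)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: avoid dividing by zero, treat every hit as top-ranked.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine(lexical_scores, vector_scores):
    """Arithmetic mean of the normalized subquery scores, one value per document."""
    return [
        (a + b) / 2
        for a, b in zip(min_max(lexical_scores), min_max(vector_scores))
    ]


# Hypothetical raw subquery scores for two different searches.
strong_lexical, strong_vector = [12.0, 8.0, 4.0], [0.75, 0.5, 0.25]
weak_lexical, weak_vector = [3.0, 2.0, 1.0], [0.375, 0.25, 0.125]

strong = combine(strong_lexical, strong_vector)
weak = combine(weak_lexical, weak_vector)

# Both searches produce the same combined scores despite different raw scores.
print(strong)
print(weak)
```

A learning-to-rank model that only sees the combined value cannot tell a strong match from a weak one here, which is why the raw subquery scores are being asked for.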
Is your feature request related to a problem?
Hybrid search doesn't return the score of each individual subquery, which makes it difficult to debug why fragments were included or excluded.
What solution would you like?
As part of the _explain API, we should provide scores of sub queries for understanding/debugging the results.