opensearch-project / opensearch-benchmark

OpenSearch Benchmark - a community driven, open source project to run performance tests for OpenSearch
https://opensearch.org/docs/latest/benchmark/
Apache License 2.0

Update VectorSearch Core Operations: Add rescore option for vector queries #517

Open jmazanec15 opened 6 months ago

jmazanec15 commented 6 months ago

Description

For some of our vector search cases, we use quantization to reduce the amount of memory that the index will consume during search. This comes with the tradeoff that recall will be worse because the quantization is lossy. One strategy to improve recall is to "rescore" the top results from the vector search with scores that originate from the full precision vectors. For example, the query would look like this:

GET my-knn-index-1/_search
{
    "size": 4,
    "query": {
        "knn": {
            "field_name": {
                "vector": [...],
                "k": 100
            }
        }
    },
    "rescore": {
        "window_size": 100,
        "query": {
            "rescore_query": {
                "script_score": {
                    "query": {
                        "match_all": {}
                    },
                    "script": {
                        "lang": "knn",
                        "source": "knn_score",
                        "params": {
                            "field": "field_name",
                            "query_value": [],
                            "space_type": "l2"
                        }
                    }
                }
            }
        }
    }
}

I would like to add this option to the existing vector search queries in a configurable fashion.
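As a rough illustration of what "configurable" could mean here, the sketch below builds the search body above and attaches the rescore clause only when a window size is supplied. The function name `build_knn_search_body` and its parameters are hypothetical and are not part of OpenSearch Benchmark. It also assumes that `query_value` should carry the same vector as the k-NN query, which the example above leaves empty.

```python
from typing import Any, Dict, List, Optional


def build_knn_search_body(
    field: str,
    vector: List[float],
    k: int,
    size: int,
    rescore_window: Optional[int] = None,  # hypothetical switch: None disables rescoring
    space_type: str = "l2",
) -> Dict[str, Any]:
    """Build a k-NN search body, optionally rescoring the top hits."""
    body: Dict[str, Any] = {
        "size": size,
        "query": {"knn": {field: {"vector": vector, "k": k}}},
    }
    if rescore_window is not None:
        # Rescore the top `rescore_window` hits with the k-NN scoring script.
        body["rescore"] = {
            "window_size": rescore_window,
            "query": {
                "rescore_query": {
                    "script_score": {
                        "query": {"match_all": {}},
                        "script": {
                            "lang": "knn",
                            "source": "knn_score",
                            "params": {
                                "field": field,
                                # Assumption: reuse the query vector here.
                                "query_value": vector,
                                "space_type": space_type,
                            },
                        },
                    }
                }
            },
        }
    return body
```

Calling it without `rescore_window` yields the plain k-NN query, so existing behavior would stay the default under this design.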

IanHoang commented 6 months ago

As with the previously opened issue (#516), this is more tailored to vectorsearch, so please create this issue in the OSB workloads repository.

Existing vector search queries can be enhanced by adding custom param sources in workload.py.
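A minimal sketch of that approach might look like the following, assuming the function-style param source signature `(workload, params, **kwargs)` and the `register(registry)` hook that workload.py files use. The source name `knn-rescore-param-source`, the `rescore` parameter key, and its `window_size` and `space_type` fields are hypothetical, chosen only for illustration.

```python
import copy
from typing import Any, Dict


def knn_rescore_param_source(workload, params: Dict[str, Any], **kwargs) -> Dict[str, Any]:
    """Return search params, adding a rescore clause when `rescore` is set."""
    # Copy so the operation definition in the workload is left unmodified.
    body = copy.deepcopy(params["body"])
    rescore = params.get("rescore")  # hypothetical operation parameter
    if rescore:
        # Take the field name and query spec from the existing knn clause.
        field, spec = next(iter(body["query"]["knn"].items()))
        body["rescore"] = {
            # Default the window to k when no window size is given.
            "window_size": rescore.get("window_size", spec["k"]),
            "query": {
                "rescore_query": {
                    "script_score": {
                        "query": {"match_all": {}},
                        "script": {
                            "lang": "knn",
                            "source": "knn_score",
                            "params": {
                                "field": field,
                                # Assumption: rescore with the same query vector.
                                "query_value": spec["vector"],
                                "space_type": rescore.get("space_type", "l2"),
                            },
                        },
                    }
                }
            },
        }
    return {"index": params.get("index"), "body": body}


def register(registry):
    # Called when the workload is loaded; the name here is hypothetical.
    registry.register_param_source("knn-rescore-param-source", knn_rescore_param_source)
```

An operation would then opt in by naming this param source and supplying a `rescore` block; operations that omit the block get their body back unchanged.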

Closing this issue as the suggested changes do not relate to the core OSB repository code.

Reopened as these would apply to already built-in OSB core operations that were added a while back.