opensearch-project / opensearch-benchmark-workloads

Official workloads used by OpenSearch Benchmark (OSB)
https://opensearch.org/docs/latest/benchmark/

[FEATURE] A new workload for semantic search #291

Closed martin-gaievski closed 1 month ago

martin-gaievski commented 1 month ago

Is your feature request related to a problem?

No existing workload tests semantic search, such as neural search or hybrid search. We can start with a workload that targets the performance of semantic search queries.

What solution would you like?

Create a workload based on a dataset of documents whose fields span multiple types, such as integer and keyword. This will allow the workload to use different query types and combine their results for semantic search use cases.

A good dataset to start from: https://dbyiw3u3rf9yr.cloudfront.net/corpora/noaa
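
To make the idea concrete, here is a rough sketch of the kind of query body such a workload could exercise: a hybrid query that combines a range sub-query on a numeric field, a term sub-query on a keyword field, and a neural sub-query. The field names (`TMAX`, `station.country_code`, `passage_embedding`), the model ID, and the helper function are assumptions for illustration only; in particular, the embedding field is hypothetical and would have to be added to the documents at ingest time. This is not the implementation from the PR.

```python
import json


def build_hybrid_query(query_text, model_id, min_temp, country_code, k=10):
    """Build a hybrid search body with structured and neural sub-queries.

    All field names are hypothetical placeholders, not confirmed
    fields of the dataset or of the proposed workload.
    """
    return {
        "query": {
            "hybrid": {
                "queries": [
                    # Numeric (integer/float) field: range filter-style sub-query.
                    {"range": {"TMAX": {"gte": min_temp}}},
                    # Keyword field: exact-match sub-query.
                    {"term": {"station.country_code": country_code}},
                    # Vector field: neural sub-query that embeds query_text
                    # with the given model and retrieves the k nearest neighbors.
                    {
                        "neural": {
                            "passage_embedding": {
                                "query_text": query_text,
                                "model_id": model_id,
                                "k": k,
                            }
                        }
                    },
                ]
            }
        }
    }


if __name__ == "__main__":
    body = build_hybrid_query(
        query_text="hot summer day",
        model_id="<model-id>",  # placeholder: ID of a deployed embedding model
        min_temp=30,
        country_code="US",
    )
    print(json.dumps(body, indent=2))
```

Note that a hybrid query needs a search pipeline with a normalization processor to combine the sub-query scores, so a workload along these lines would presumably have to create that pipeline during setup.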

I'm opening a PR with the corresponding implementation.