opensearch-project / opensearch-benchmark-workloads

Official workloads used by OpenSearch Benchmark (OSB)
https://opensearch.org/docs/latest/benchmark/

[FEATURE] Workload for semantic search that has vector data and text fields #341

Open martin-gaievski opened 1 month ago

martin-gaievski commented 1 month ago

Is your feature request related to a problem?

There is no workload for semantic search that contains both vector data and text fields. Such a workload would benefit hybrid query use cases, because a typical hybrid query combines a neural/knn query with text- or numeric-based queries such as match or term.

The existing workload requested in https://github.com/opensearch-project/opensearch-benchmark-workloads/issues/291 is limited to text and numeric fields.
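
For illustration, a hybrid query of the kind described above might look roughly like the sketch below. It builds a search body that pairs a `match` clause with a `knn` clause under OpenSearch's `hybrid` query type. The helper name `build_hybrid_query` and the field names `text` and `embedding` are hypothetical placeholders, not names from any existing workload, and running such a query against a cluster would presumably also need a search pipeline with a score normalization processor, which is not shown here.

```python
import json


def build_hybrid_query(text, vector, text_field="text", vector_field="embedding", k=10):
    """Build a search body combining a lexical clause and a k-NN clause.

    The helper name and default field names are hypothetical; a real
    workload would use whatever fields its index mapping defines.
    """
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical part: plain full-text match on the text field.
                    {"match": {text_field: {"query": text}}},
                    # Vector part: k-NN search on the vector field.
                    {"knn": {vector_field: {"vector": vector, "k": k}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    # Toy 3-dimensional vector; real embeddings are much larger.
    body = build_hybrid_query("wild west", [0.1, 0.2, 0.3])
    print(json.dumps(body, indent=2))
```

A workload with only text and numeric fields can exercise the first clause but not the second, which is the gap this request is about.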

What solution would you like?

A workload that allows running semantic search queries based on vector search and combining them with text search. In terms of metrics, we can start with performance data such as p50-p99 latency and throughput; information retrieval metrics can be added later.

gkamat commented 1 month ago

@martin-gaievski, perhaps requests for such related workloads could be consolidated under a meta issue, for better visibility?

martin-gaievski commented 1 month ago

> @martin-gaievski, perhaps requests for such related workloads could be consolidated under a meta issue, for better visibility?

Makes sense, added meta issue https://github.com/opensearch-project/opensearch-benchmark-workloads/issues/354