opensearch-project / opensearch-benchmark

OpenSearch Benchmark - a community driven, open source project to run performance tests for OpenSearch
https://opensearch.org/docs/latest/benchmark/
Apache License 2.0

[META] Synthetic data corpus generator #617

Open gkamat opened 3 months ago

gkamat commented 3 months ago

Is your feature request related to a problem? Please describe

OSB ships with a number of workloads and associated data corpora, but they are limited in scope and size. Being able to generate documents synthetically from a user-provided specification would be a useful capability. It would also allow corpora of arbitrary size to be created, and would fit well with the data-stream model.
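
To make the request concrete, here is a minimal sketch of what spec-driven generation could look like. The spec format, the field names, and the `generate_docs` and `to_ndjson` helpers are all hypothetical and invented for illustration; none of them are existing OSB APIs or formats. The generator is unbounded, so the caller decides how large the corpus is.

```python
import itertools
import json
import random

# Hypothetical spec format, invented for this sketch; not an OSB format.
SPEC = {
    "status": {"type": "choice", "values": [200, 404, 500]},
    "bytes": {"type": "int", "min": 0, "max": 10000},
    "latency_ms": {"type": "float", "min": 0.0, "max": 250.0},
}


def generate_docs(spec, seed=None):
    """Yield an unbounded stream of documents described by `spec`."""
    rng = random.Random(seed)  # seeded so a corpus can be reproduced
    while True:
        doc = {}
        for field, rule in spec.items():
            kind = rule["type"]
            if kind == "choice":
                doc[field] = rng.choice(rule["values"])
            elif kind == "int":
                doc[field] = rng.randint(rule["min"], rule["max"])
            elif kind == "float":
                doc[field] = rng.uniform(rule["min"], rule["max"])
            else:
                raise ValueError(f"unsupported field type: {kind}")
        yield doc


def to_ndjson(docs, count):
    """Serialize the first `count` documents as newline-delimited JSON."""
    return "\n".join(json.dumps(d) for d in itertools.islice(docs, count))


if __name__ == "__main__":
    # Emit a small sample; a larger `count` gives a larger corpus.
    print(to_ndjson(generate_docs(SPEC, seed=42), 3))
```

Because documents are produced lazily, the same generator could write a fixed-size corpus file or feed an ongoing stream, which is the property that matters for the data-stream model.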