opensearch-project / opensearch-benchmark-workloads

Official workloads used by OpenSearch Benchmark (OSB)
https://opensearch.org/docs/latest/benchmark/

Performance benchmark for wildcard field type #358

Open msfroh opened 4 months ago

msfroh commented 4 months ago

In 2.15, we added the wildcard field type, which works well for cases where you want arbitrary substring matches (versus full-text search).

While we added unit tests and integration tests, we did not do any benchmarking. In particular, it would be nice to see how matches on arbitrary substrings on a wildcard field compare to similar queries on text/keyword field types. (Also, it would be interesting to see how much worse exact matches perform on wildcard fields versus text/keyword.)
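To make the comparison concrete, here is a minimal sketch of the four query bodies such a benchmark might pair up: substring and exact matches, each against a keyword field and a wildcard field. The field names `request.raw` and `request.wildcard` and the search term are assumptions for illustration, not taken from an existing workload.

```python
import json


def substring_query(field, fragment):
    """Wildcard query matching values of `field` that contain `fragment`.

    Note: `*` or `?` inside `fragment` would be treated as wildcards.
    """
    return {"query": {"wildcard": {field: {"value": f"*{fragment}*"}}}}


def exact_query(field, value):
    """Term query matching values of `field` equal to `value`."""
    return {"query": {"term": {field: {"value": value}}}}


# Hypothetical field names: a keyword field and a wildcard-typed copy of it.
KEYWORD_FIELD = "request.raw"
WILDCARD_FIELD = "request.wildcard"

comparisons = {
    "substring-keyword": substring_query(KEYWORD_FIELD, "images"),
    "substring-wildcard": substring_query(WILDCARD_FIELD, "images"),
    "exact-keyword": exact_query(KEYWORD_FIELD, "GET /images/logo.gif HTTP/1.0"),
    "exact-wildcard": exact_query(WILDCARD_FIELD, "GET /images/logo.gif HTTP/1.0"),
}

if __name__ == "__main__":
    print(json.dumps(comparisons, indent=2))
```

Running the same query shape against both field types keeps the comparison fair: only the target field changes between each pair.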

We should be able to add wildcard fields to the http_logs and Big5 workloads. (It's extra indexing work, so we need to keep in mind that indexing numbers will get worse.)
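One possible way to do this is to add a wildcard-typed multi-field next to an existing text/keyword field, so the same source value is indexed both ways. The sketch below assumes a mapping fragment with a `request` field and a sub-field named `wildcard`; both names and the fragment itself are illustrative, not copied from the http_logs or Big5 mappings.

```python
import copy


def add_wildcard_subfield(mappings, field, subfield="wildcard"):
    """Return a copy of `mappings` with a wildcard multi-field under `field`."""
    updated = copy.deepcopy(mappings)
    target = updated["properties"][field]
    # Multi-fields index the same source value a second time, which is
    # where the extra indexing cost comes from.
    target.setdefault("fields", {})[subfield] = {"type": "wildcard"}
    return updated


# Assumed mapping fragment, for illustration only.
base_mappings = {
    "properties": {
        "request": {
            "type": "text",
            "fields": {"raw": {"type": "keyword", "ignore_above": 256}},
        }
    }
}

wildcard_mappings = add_wildcard_subfield(base_mappings, "request")
```

Keeping the original mapping untouched (the function returns a copy) makes it easy to index once with and once without the wildcard sub-field and compare indexing throughput between the two runs.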

IanHoang commented 1 month ago

After adding wildcard operations to http_logs and Big5, we could also create a separate test procedure that is limited to text/keyword queries, so users can easily benchmark and compare the results.
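A rough sketch of how such a baseline procedure could be derived: filter out every operation whose query body targets a wildcard field and build a schedule from the rest. The operation names, field names, and iteration counts are hypothetical, and the `name`/`description`/`schedule` keys are an assumption about the general shape of a workload's test procedure that should be checked against the OSB workload documentation.

```python
import json


def baseline_procedure(operations, wildcard_fields, name="text-keyword-only"):
    """Build a test procedure that skips operations touching wildcard fields."""

    def touches_wildcard(op):
        # Look for the quoted field name anywhere in the serialized body.
        body = json.dumps(op.get("body", {}))
        return any(f'"{field}"' in body for field in wildcard_fields)

    schedule = [
        # Iteration counts are placeholders, not tuned values.
        {"operation": op["name"], "warmup-iterations": 100, "iterations": 100}
        for op in operations
        if not touches_wildcard(op)
    ]
    return {
        "name": name,
        "description": "Baseline limited to text/keyword queries",
        "schedule": schedule,
    }


# Hypothetical operations, for illustration only.
operations = [
    {"name": "substring-keyword",
     "body": {"query": {"wildcard": {"request.raw": {"value": "*images*"}}}}},
    {"name": "substring-wildcard",
     "body": {"query": {"wildcard": {"request.wildcard": {"value": "*images*"}}}}},
]

procedure = baseline_procedure(operations, wildcard_fields=["request.wildcard"])
```

Filtering by target field rather than by query type matters here, since a `wildcard` query can also run against a keyword field and should stay in the baseline.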