qdrant / vector-db-benchmark

Framework for benchmarking vector search engines
https://qdrant.tech/benchmarks/
Apache License 2.0
250 stars 68 forks source link

Use efficient filtering for opensearch #167

Open igniting opened 1 month ago

igniting commented 1 month ago

Current filtering query does post-filtering which is not ideal and resulted in low precision.

Results with post filtering:

{
  "params": {
    "dataset": "arxiv-titles-384-angular-filters",
    "experiment": "opensearch-default",
    "engine": "opensearch",
    "parallel": 1,
    "config": {
      "knn.algo_param.ef_search": 128
    }
  },
  "results": {
    "total_time": 708.6168032020068,
    "mean_time": 0.07045893384491791,
    "mean_precisions": 0.11399200000000001,
    "std_time": 0.06840096039381999,
    "min_time": 0.008397486002650112,
    "max_time": 3.3753458530118223,
    "rps": 14.111999538838594,
    "p95_time": 0.18164870390755816,
    "p99_time": 0.20864198897208555
  }
}

Results with new efficient filtering:

{
  "params": {
    "dataset": "arxiv-titles-384-angular-filters",
    "experiment": "opensearch-default",
    "engine": "opensearch",
    "parallel": 1,
    "config": {
      "knn.algo_param.ef_search": 128
    }
  },
  "results": {
    "total_time": 394.4290532110026,
    "mean_time": 0.03913764159695711,
    "mean_precisions": 0.610144,
    "std_time": 0.05352479065894972,
    "min_time": 0.0009066620114026591,
    "max_time": 2.1307434440095676,
    "rps": 25.35310195481576,
    "p95_time": 0.1274049270534305,
    "p99_time": 0.2078282342318563
  }
}