opensearch-project / opensearch-benchmark-workloads

Official workloads used by OpenSearch Benchmark (OSB)
https://opensearch.org/docs/latest/benchmark/

[BUG] Encountering bug in integration tests for Train Model KNN Vectorsearch #348

Closed IanHoang closed 3 months ago

IanHoang commented 3 months ago

What is the bug?

#332 introduced this bug, which is breaking integration tests in the OSB repository.

it/proxy_test.py:87: AssertionError
------------------------------ Captured log call -------------------------------
INFO     osbenchmark.utils.process:process.py:128 [ERROR] Cannot list. Could not load '/home/runner/work/opensearch-benchmark/opensearch-benchmark/.benchmark/benchmarks/workloads/default/vectorsearch/workload.json': Expecting value: line 484 column 26 (char 11545). Lines containing the error:

            "training_index": "train_index",
            "training_field": "train_field",
            "search_size": "10000",
            "dimension": ,
-------------------------^ Error is here
            "method": {
                "name": "ivf", 

The complete workload has been written to '/tmp/tmp8vue0u1e.json' for diagnosis. 

Suggestion: Verify that [vectorsearch] workload has correctly formatted JSON files and Jinja Templates. For Jinja2 errors, consider using a live Jinja2 parser. See common workload formatting errors:
    ---------------------------------------------------------------------------------------------------------------------------
    [Common workload formatting errors:] 

    - Jinja2 expression missing parameters (e.g. got {{search_clients}} but needs {{search_clients | default(8)}})

    - Jinja2 expression missing "tojson" parameter when needed(e.g. got {{index_settings | default({})}} but needs {{index_settings | default({}) | tojson}})

    - JSON file might not be correctly formatted after rendering Jinja2 (e.g. additional brackets (}, ]) or missing commas (,))
    ---------------------------------------------------------------------------------------------------------------------------
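The parse failure in the log can be reproduced in isolation with the standard `json` module. The sketch below is only illustrative: the fragment is a shortened, hand-written copy of the rendered lines quoted above (not the real `workload.json`), and `describe_json_error` is a hypothetical helper name.

```python
import json

# Shortened copy of the rendered output quoted in the log. The "dimension"
# key has no value, which is what an undefined template variable leaves behind.
rendered = '''{
    "training_index": "train_index",
    "training_field": "train_field",
    "search_size": "10000",
    "dimension": ,
    "method": {"name": "ivf"}
}'''


def describe_json_error(text):
    """Return (message, line, column) of the first JSON syntax error, or None if the text parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.msg, err.lineno, err.colno
    return None


error = describe_json_error(rendered)
if error is not None:
    message, line, column = error
    # The parser stops at the comma that follows the empty value.
    print(f"{message}: line {line} column {column}")
```

The reported position points at the comma after `"dimension":`, which matches the caret in the log: the parser expected a value there and found none.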

How can one reproduce the bug?

https://github.com/opensearch-project/opensearch-benchmark/actions/runs/9998125448/job/27636295748?pr=588

IanHoang commented 3 months ago

@finnroblin We need to add a default value here, or an if statement if this field is not always needed. Could you advise on what the default value should be and cut a PR for this ASAP? It is currently blocking integration tests in the OSB repository. https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/29d9715cf03df68380e801c64a87d6be20e09e5a/vectorsearch/test_procedures/common/train-model-schedule.json#L17
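For reference, the two options mentioned above (a default value or an if statement) might look roughly like this. This is a minimal sketch, not the actual fix: it assumes the `jinja2` package is installed, the one-line templates are simplified stand-ins for the real schedule file, the variable name `dimension` is a placeholder, and the fallback of 128 is arbitrary, since the right default is still an open question here.

```python
import json

from jinja2 import Template

# Simplified stand-ins for the failing template line (assumptions, see above).
UNGUARDED = '{"dimension": {{ dimension }} }'
WITH_DEFAULT = '{"dimension": {{ dimension | default(128) }} }'
WITH_GUARD = '{ {% if dimension is defined %}"dimension": {{ dimension }}{% endif %} }'


def render_and_parse(source, **params):
    """Render a template with Jinja2, then parse it as JSON; return None if the result is not valid JSON."""
    rendered = Template(source).render(**params)
    try:
        return json.loads(rendered)
    except json.JSONDecodeError:
        return None


# No parameter supplied: the unguarded template renders an empty value.
print(render_and_parse(UNGUARDED))
# The default filter fills in the fallback when the parameter is missing.
print(render_and_parse(WITH_DEFAULT))
# The if statement drops the key entirely when the parameter is missing.
print(render_and_parse(WITH_GUARD))
# A supplied parameter is used as-is by both variants.
print(render_and_parse(WITH_DEFAULT, dimension=768))
```

Which variant is appropriate depends on whether the field is required by the operation: a default keeps the key present in every run, while the guard omits it unless the caller passes a value.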