opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[BUG] The split response processor is not enabled in 2.16 #15229

Closed dblock closed 2 months ago

dblock commented 2 months ago

Describe the bug

The split response processor, which the documentation says was introduced in 2.16, is not enabled in 2.16.

https://opensearch.org/docs/latest/search-plugins/search-pipelines/split-processor/

Related component

Search

To Reproduce

Coming from https://github.com/opensearch-project/opensearch-api-specification/pull/505 with the following test file.

$schema: ../../../../../json_schemas/test_story.schema.yaml

description: |-
  Test the creation of a search pipeline with a response processor.
prologues:
  - path: /movies/_doc/1
    method: POST
    parameters:
      refresh: true
    request:
      payload:
        names: Drive, 1984, Moneyball
    status: [201]
epilogues:
  - path: /_search/pipeline/names_pipeline
    method: DELETE
    status: [200, 404]
  - path: /movies
    method: DELETE
    status: [200, 404]
version: '>= 2.16'
chapters:
  - synopsis: Create search pipeline.
    path: /_search/pipeline/{id}
    method: PUT
    parameters:
      id: names_pipeline
    request:
      payload:
        response_processors:
          - split:
              field: names
              separator: ', '
              target_field: split_names
    response:
      status: 200
      payload:
        acknowledged: true
  - synopsis: Query created pipeline.
    path: /_search/pipeline/{id}
    method: GET
    parameters:
      id: names_pipeline
    response:
      status: 200
  - synopsis: Search.
    path: /{index}/_search
    method: GET
    parameters:
      index: movies
      search_pipeline: names_pipeline
    response:
      status: 200
      payload:
        hits:
          total:
            value: 1
          hits:
            - _index: movies
              _source:
                names: Drive, 1984, Moneyball
                split_names:
                  - '1984'
                  - Drive
                  - Moneyball

$ npm run test:spec--insecure -- --tests tests/default/_core/search/pipeline/split.yaml --verbose

> opensearch_api_tools@1.0.0 test:spec--insecure
> ts-node tools/src/tester/test.ts --opensearch-insecure --tests tests/default/_core/search/pipeline/split.yaml --verbose

[INFO] Authenticating with admin ...
[INFO] Connecting to https://localhost:9200 ... (1/20)
OpenSearch 2.16.0

[INFO] => POST /movies/_doc/1 ({
  "refresh": true
}) [application/json] {
  "names": "Drive, 1984, Moneyball"
}
[INFO] <= 201 (application/json; charset=UTF-8) | {
  "_index": "movies",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "forced_refresh": true,
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
[INFO] => PUT /_search/pipeline/names_pipeline ({}) [application/json] {
  "response_processors": [
    {
      "split": {
        "field": "names",
        "separator": ", ",
        "target_field": "split_names"
      }
    }
  ]
}
[INFO] <= 400 (application/json) | {
  "root_cause": [
    {
      "type": "illegal_argument_exception",
      "reason": "Invalid processor type split"
    }
  ],
  "type": "illegal_argument_exception",
  "reason": "Invalid processor type split"
}

Expected behavior

Processor to be enabled.
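
As a rough illustration of what the test story expects once the processor is registered, here is a minimal Python sketch of the transformation on one hit's `_source`. The `apply_split` helper is hypothetical and only models the `field`, `separator`, and `target_field` parameters from the pipeline above, treating the separator as a literal string; it is not OpenSearch code.

```python
def apply_split(source: dict, field: str, separator: str, target_field: str) -> dict:
    """Return a copy of `source` with `field` split into a list under `target_field`.

    Hypothetical helper: models the expected effect only, not OpenSearch's implementation.
    """
    result = dict(source)  # shallow copy so the original document is left untouched
    result[target_field] = source[field].split(separator)
    return result


# Same document and parameters as in the test story above.
hit_source = {"names": "Drive, 1984, Moneyball"}
split_source = apply_split(
    hit_source, field="names", separator=", ", target_field="split_names"
)
print(split_source["split_names"])
```

The original `names` field is kept and the list is written to `split_names`, matching the shape of the expected search response in the test file.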

Additional Details

2.16

Haven't tried but looks like adding the processor to search.pipeline.common.response.processors.allowed may be a workaround.

dbwiddis commented 2 months ago

The split processor is missing here: https://github.com/opensearch-project/OpenSearch/blob/54c13a6ae6afcdb18a8ffb1b1ef8044ec3b1ce56/modules/search-pipeline-common/src/main/java/org/opensearch/search/pipeline/common/SearchPipelineCommonModulePlugin.java#L89-L104

It was correctly added here: https://github.com/opensearch-project/OpenSearch/pull/14800/files#diff-13c9de4d1b7eaef9dac008ddbff90b83b84e00dc13c2547c6c67484f26efc1e2

But then I likely messed up a rebase / merge conflict resolution and removed it here: https://github.com/opensearch-project/OpenSearch/pull/14785/files#diff-13c9de4d1b7eaef9dac008ddbff90b83b84e00dc13c2547c6c67484f26efc1e2

> Haven't tried but looks like adding the processor to search.pipeline.common.response.processors.allowed may be a workaround.

Unfortunately this is not a dynamic setting, so it looks like it needs to be added to opensearch.yml before startup.

CC: @ohltyler

dbwiddis commented 2 months ago

> Haven't tried but looks like adding the processor to search.pipeline.common.response.processors.allowed may be a workaround.

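Annnnd, it's not. The code the split processor is missing from filters that setting, removing anything that's not in that internal map. Since split isn't in the map, allowing it in the setting has no effect, so there is no workaround in 2.16.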
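
To illustrate why the allowlist cannot help, here is a minimal Python sketch of the filtering as described in this comment. It is a simplified model under stated assumptions, not the actual Java code in SearchPipelineCommonModulePlugin: the `filter_allowed` function and the contents of `registered` are hypothetical stand-ins for the internal map.

```python
from typing import Optional


def filter_allowed(registered: dict, allowlist: Optional[list]) -> dict:
    """Keep only registered processors that also appear in the allowlist.

    Hypothetical model: with no allowlist configured, everything registered stays enabled.
    """
    if allowlist is None:
        return dict(registered)
    allowed = set(allowlist)
    return {name: factory for name, factory in registered.items() if name in allowed}


# Assumed stand-in for the internal map; "split" is absent, as in the 2.16 bug.
registered = {"rename_field": object(), "truncate_hits": object()}

# Simulates setting search.pipeline.common.response.processors.allowed to [split, rename_field].
enabled = filter_allowed(registered, ["split", "rename_field"])
print(sorted(enabled))
```

Because the result is drawn only from the registered map, listing `split` in the setting can narrow the set of enabled processors but can never add one that was not registered in the first place.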

dblock commented 2 months ago

@dbwiddis Update the docs to say 2.17 for it before too many users try it?