opensearch-project / opensearch-java

Java Client for OpenSearch
Apache License 2.0

[BUG] Deserializing a custom analyzer without `type` specified fails #1032

Closed by Xtansia 1 week ago

Xtansia commented 1 week ago

What is the bug?

It is possible to manually create a custom analyzer without explicitly specifying the `type` property. When the client attempts to deserialize such an analyzer, for example when retrieving the index with `client.indices().get()`, deserialization fails because the `type` property is missing. The server's logic is that `type` must be specified unless `tokenizer` is specified, in which case `type: "custom"` is assumed.
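The server-side rule described above can be illustrated with a minimal sketch. This is not the actual server or client code: the class `AnalyzerTypeResolver` and the method `resolveType` are hypothetical names used only for illustration, and the analyzer definition is modeled as a plain `Map` rather than the client's real types.

```java
import java.util.List;
import java.util.Map;

public class AnalyzerTypeResolver {

    // Hypothetical helper: returns the effective analyzer type for a raw
    // analyzer definition, following the rule described in this issue.
    static String resolveType(Map<String, Object> analyzer) {
        Object type = analyzer.get("type");
        if (type != null) {
            // An explicit type always wins.
            return type.toString();
        }
        if (analyzer.containsKey("tokenizer")) {
            // No type, but a tokenizer is present: assume a custom analyzer.
            return "custom";
        }
        // Neither type nor tokenizer: the definition is invalid.
        throw new IllegalArgumentException(
            "analyzer must specify either 'type' or 'tokenizer'");
    }

    public static void main(String[] args) {
        // Same shape as the analyzer in the reproduction steps: no "type" key.
        Map<String, Object> analyzer = Map.of(
            "tokenizer", "kuromoji_tokenizer",
            "filter", List.of("kuromoji_baseform", "ja_stop"),
            "char_filter", List.of("icu_normalizer"));
        System.out.println(resolveType(analyzer)); // prints "custom"
    }
}
```

A client-side fix would presumably need an equivalent fallback when it picks the analyzer variant to deserialize into, but where exactly that belongs in the client is not stated in this issue.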

How can one reproduce the bug?

  1. Create an index with a custom analyzer in Dev Tools:
    PUT /custom-analyzer-index
    {
        "mappings": {
            "properties": {
                "text_chunk": {
                    "type": "text",
                    "analyzer": "custom_kuromoji_analyzer"
                }
            }
        },
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "custom_kuromoji_analyzer": {
                            "filter": [
                                "kuromoji_baseform",
                                "ja_stop"
                            ],
                            "char_filter": [
                                "icu_normalizer"
                            ],
                            "tokenizer": "kuromoji_tokenizer"
                        }
                    }
                }
            }
        }
    }
  2. Execute client.indices().get(b -> b.index("custom-analyzer-index"))

What is the expected behavior?

The analyzer should deserialize successfully as a `CustomAnalyzer`.