elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

"cluster settings" JSON structure is ambiguous with dotted `.frozen` settings #85789

Closed PhaedrusTheGreek closed 2 years ago

PhaedrusTheGreek commented 2 years ago

Elasticsearch Version

7.17.0

Installed Plugins

No response

Java Version

bundled

OS Version

Darwin

Problem Description

GET _cluster/settings?include_defaults outputs ambiguously structured field names:

      ...
      "max_shards_per_node.frozen" : "3000",
      ...
      "max_shards_per_node" : "1000",
      ...

The same applies to cluster.routing.allocation.disk.watermark.flood_stage vs. cluster.routing.allocation.disk.watermark.flood_stage.frozen, and possibly other settings.

While the JSON itself is valid, it cannot be indexed:

"error": {
  "type": "illegal_argument_exception",
  "reason": "can't merge a non object mapping [cluster.max_shards_per_node] with an object mapping"
}
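To illustrate why the document is rejected, here is a minimal Python sketch of dot expansion. It is not Elasticsearch's mapper code; `expand_dots` is a hypothetical helper that treats every dot as an object boundary and fails when the same path has to be both a value and an object, which is the situation the error above describes.

```python
def expand_dots(doc):
    """Expand dotted field names into nested dicts.

    Raises ValueError when a path would have to be both a plain value
    and an object (hypothetical stand-in for the mapping merge failure).
    """
    tree = {}
    for name, value in doc.items():
        parts = name.split(".")
        node = tree
        for i, part in enumerate(parts[:-1]):
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                path = ".".join(parts[: i + 1])
                raise ValueError(f"[{path}] is both a value and an object")
            node = child
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"[{name}] is both a value and an object")
        node[leaf] = value
    return tree


if __name__ == "__main__":
    doc = {
        "cluster.max_shards_per_node": "1000",
        "cluster.max_shards_per_node.frozen": "3000",
    }
    try:
        expand_dots(doc)
    except ValueError as err:
        print(err)
```

The clash is independent of key order: whichever of the two settings is seen second cannot be placed in the tree.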

I'm not sure whether this also presents a problem on PUT _cluster/settings. In ES versions prior to the introduction of the .frozen settings, the output was indexable as-is.

Having indexable output is important for diagnostic analysis / configuration validation. A workaround to convert dots to another character might work, but doesn't seem like the right solution.
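The dot-replacement workaround mentioned above could look roughly like the following Python sketch. `replace_dots` is a hypothetical helper and the underscore is an arbitrary choice; the sketch assumes the response has already been parsed into dicts and lists, and it does not guard against a renamed key colliding with an existing one.

```python
def replace_dots(obj, replacement="_"):
    """Return a copy of obj with '.' in every dict key replaced.

    Caveat: a renamed key can collide with an existing key
    (e.g. 'a.b' and 'a_b'); this sketch does not detect that.
    """
    if isinstance(obj, dict):
        return {
            key.replace(".", replacement): replace_dots(value, replacement)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        return [replace_dots(item, replacement) for item in obj]
    return obj


if __name__ == "__main__":
    settings = {
        "cluster": {
            "max_shards_per_node.frozen": "3000",
            "max_shards_per_node": "1000",
        }
    }
    print(replace_dots(settings))
```

The cost is that the indexed field names no longer match the real setting names, which is part of why it does not feel like the right solution.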

Steps to Reproduce

POST _bulk
{ "create": { "_index": "test-index" } }
{"cluster.max_shards_per_node": "1000", "cluster.max_shards_per_node.frozen": "3000"}

Logs (if relevant)

No response

elasticmachine commented 2 years ago

Pinging @elastic/es-core-infra (Team:Core/Infra)

rjernst commented 2 years ago

> While the JSON itself is valid, it cannot be indexed

AFAIK we don't promise that the output of cluster settings can be indexed. The examples you found are simply two different settings. Since the settings are internally a flat map, overlapping nested structure does not matter, so nothing in the settings infrastructure stops this from occurring.
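As a rough illustration of how a flat map can yield the output reported in this issue, the Python sketch below nests dotted keys until it reaches a prefix that is itself a setting, then keeps the remainder as a single dotted leaf name. The rule is inferred from the output quoted in this thread, not taken from the Elasticsearch source, and `nest_settings` is a hypothetical name.

```python
def nest_settings(flat):
    """Render a flat {dotted.key: value} map as nested dicts.

    Assumption: nesting stops as soon as the prefix so far is itself
    a setting, so the rest of the key stays dotted at that level.
    """
    nested = {}
    for key, value in sorted(flat.items()):
        parts = key.split(".")
        node = nested
        i = 0
        while i < len(parts) - 1:
            prefix = ".".join(parts[: i + 1])
            if prefix in flat:
                break  # prefix is a setting of its own; cannot become an object
            node = node.setdefault(parts[i], {})
            i += 1
        node[".".join(parts[i:])] = value
    return nested


if __name__ == "__main__":
    flat = {
        "cluster.max_shards_per_node": "1000",
        "cluster.max_shards_per_node.frozen": "3000",
    }
    print(nest_settings(flat))
```

Under that assumption, the two settings end up as sibling keys inside `cluster`, one of them with a dot in its name.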

VimCommando commented 2 years ago

That is a little frustrating. In my test cluster I found seven different values in the cluster settings that run into this because they have a . in the setting name.

{
  "defaults" : {
    "cluster" : {
      "max_shards_per_node.frozen" : "3000",
      "routing" : {
        "allocation" : {
          "disk" : {
            "watermark" : {
              "flood_stage.frozen.max_headroom" : "20GB",
              "flood_stage.frozen" : "95%"
            }
          }
        }
      },
      "searchable" : {
        "snapshot" : {
          "shared_cache" : {
            "size.max_headroom" : "-1"
          }
        }
      }
    },
    "transport" : {
      "type.default" : "netty4"
    },
    "discovery" : {
      "zen" : {
        "ping" : {
          "unicast" : {
            "hosts.resolve_timeout" : "5s"
          }
        }
      }
    },
    "http" : {
      "type.default" : "netty4"
    }
  }
}

Once flattened there isn't a way to infer where the object stops and the setting name starts.
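For diagnostics, a small Python sketch like the one below can flatten the parsed response and list every setting whose full name is also a prefix of another setting. Both function names are hypothetical, and the input is assumed to be the already parsed JSON body.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a {dotted.path: value} map."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def prefix_conflicts(flat):
    """Return (prefix, key) pairs where a setting name prefixes another one."""
    keys = set(flat)
    conflicts = []
    for key in sorted(keys):
        parts = key.split(".")
        for i in range(1, len(parts)):
            prefix = ".".join(parts[:i])
            if prefix in keys:
                conflicts.append((prefix, key))
    return conflicts


if __name__ == "__main__":
    defaults = {
        "cluster": {
            "max_shards_per_node": "1000",
            "max_shards_per_node.frozen": "3000",
        }
    }
    for prefix, key in prefix_conflicts(flatten(defaults)):
        print(f"{prefix} overlaps with {key}")
```

The flattened path alone does not say which dots were object boundaries and which were part of the setting name, so the check works purely on string prefixes.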

elasticmachine commented 2 years ago

Pinging @elastic/es-search (Team:Search)

rjernst commented 2 years ago

I'm moving this over to the search team, who may be working on something that will solve this problem, by allowing object fields (the intermediate field names) to also have concrete mappings.

javanna commented 2 years ago

This is addressed by #86166. Note that documents holding fields in the shape of e.g. "max_shards_per_node.frozen" : "3000", "max_shards_per_node" : "1000" must be mapped with dot expansion disabled for field names that contain dots. In the case of settings, it looks like subobjects: false should be configured for the root object.
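A minimal sketch of an index-creation body following that suggestion, written as a Python dict so it can be serialized and sent with any HTTP client. It assumes an Elasticsearch version that includes the change from #86166; `settings_index_body` is a hypothetical helper and no index name or client call is implied.

```python
import json


def settings_index_body():
    """Build an index-creation body with dot expansion disabled at the root.

    Assumption: the target cluster supports the `subobjects` mapping
    parameter described in the comment above.
    """
    return {
        "mappings": {
            # Keep dotted names such as "max_shards_per_node.frozen"
            # as leaf fields instead of expanding them into objects.
            "subobjects": False
        }
    }


if __name__ == "__main__":
    print(json.dumps(settings_index_body(), indent=2))
```

The serialized body would be passed when creating the index that stores the settings documents.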