elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

dynamic field mapping plus strict dynamic mapping #12358

Open djschny opened 9 years ago

djschny commented 9 years ago

It appears that one cannot force strict dynamic mapping while also having field dynamic mappings.

For example, what if one wants a mapping in which all fields matching "s_*" are valid and mapped to string, while anything that does not match, such as "foobar", is rejected? Consider the following template:

PUT _template/mystack_wildcard
{
  "template": "mystack*",
  "mappings": {
    "question": {
      "dynamic": "strict",
      "dynamic_templates": [
        {
          "strings": {
            "match": "s_*",
            "mapping": { "type": "string" }
          }
        }
      ]
    }
  }
}

I would expect the first index request below to succeed and the second to fail.

Expected to succeed:

PUT /mystack_foo/question/1
{
  "s_name": "This should work"
}

Expected to fail:

PUT /mystack_foo/question/1
{
  "foobar": "This should be ignored"
}

I'm not sure whether I'm misinterpreting something or whether this is a bug, but it seems like the behavior described above should be possible and valid.
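To make the expected behavior concrete, here is a small Python model of it. This is only a sketch of the requested semantics, not how Elasticsearch evaluates mappings: the function name `rejected_fields` is hypothetical, and using `fnmatch` to interpret the `s_*` pattern is an assumption made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical model of "strict + dynamic templates": a field is accepted if it
# is already mapped explicitly or its name matches a template pattern.
# Everything else is rejected. This is NOT Elasticsearch code.
TEMPLATE_PATTERNS = ["s_*"]


def rejected_fields(doc, mapped_fields=(), patterns=TEMPLATE_PATTERNS):
    """Return the top-level field names that would cause the document to fail."""
    rejected = []
    for name in doc:
        if name in mapped_fields:
            continue  # explicitly mapped fields are always fine
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            continue  # covered by a dynamic template
        rejected.append(name)
    return rejected


print(rejected_fields({"s_name": "This should work"}))
print(rejected_fields({"foobar": "This should be ignored"}))
```

Under this model the first document has no rejected fields, while the second one is rejected because of `foobar`.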

jpountz commented 9 years ago

I think both should fail given that we need to map some fields dynamically while dynamic is strict? We could potentially add a new option that would only allow dynamic mappings if there is an explicit matching template.

There are several other open issues that try to find a middle ground between dynamic: strict and dynamic: true. For instance, #11443 proposes a maximum number of dynamically mapped fields, and I think I remember someone proposing to limit dynamically mapped fields to a regular expression, but I can't find the issue number. Recently, we have been working on removing options from mappings, so I would be careful before adding new ones. In particular, I'd like to make sure it would be useful to a significant part of the user base so that it would be worth maintaining.

djschny commented 9 years ago

From an external perspective it is confusing IMO, as the user has entered a definition in the mapping that matches on fields that start with "s_*", therefore Elasticsearch is not looking at data and inferring what new field to map to. Instead the field coming in matches a user defined mapping.

I don't think a new option is needed here, but rather a correction of the behavior of the existing mappings. I hope I am articulating myself correctly.

rjernst commented 9 years ago

Instead the field coming in matches a user defined mapping

No, the data matches a template for a field. s_* is not a field name, it is a pattern to match when creating a field dynamically. Hence, setting dynamic: false would mean not allowing dynamic mappings (adding fields dynamically).

I agree with @jpountz that this could be solved through an additional setting, but I am very wary of adding such a setting.

clintongormley commented 9 years ago

I think a clean way to do it would be to leave the dynamic: true setting in place, but allow a final rule in the dynamic templates that says either: ignore this field or throw an error.
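A rough Python sketch of that idea, for illustration only: rules are checked in order and the first match wins, with a catch-all rule at the end that either ignores the field or raises an error. The `action` key, the `resolve_field` function, and the `INFERRED` fallback are all hypothetical names invented for this sketch; none of them is existing Elasticsearch syntax.

```python
from fnmatch import fnmatchcase

# Sentinel for "no rule matched": fall back to ordinary dynamic mapping,
# as with dynamic: true. Hypothetical, for this sketch only.
INFERRED = "inferred from the value"


class RejectedFieldError(ValueError):
    """Raised when a field hits a rule whose action is "error"."""


def resolve_field(name, rules):
    """Return the mapping for a new field, None if ignored, or raise."""
    for rule in rules:  # first matching rule wins
        if fnmatchcase(name, rule["match"]):
            action = rule.get("action", "map")
            if action == "error":
                raise RejectedFieldError(f"field [{name}] is not allowed")
            if action == "ignore":
                return None
            return rule["mapping"]
    return INFERRED


STRINGS_RULE = {"match": "s_*", "mapping": {"type": "string"}}
RULES_ERROR = [STRINGS_RULE, {"match": "*", "action": "error"}]
RULES_IGNORE = [STRINGS_RULE, {"match": "*", "action": "ignore"}]

print(resolve_field("s_name", RULES_ERROR))
print(resolve_field("foobar", RULES_IGNORE))
```

With the "error" variant the final rule behaves like dynamic: strict for unmatched fields; with the "ignore" variant it behaves like dynamic: false.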

sb1977 commented 9 years ago

+1 for the rule to ignore a field in dynamic templates

fabiank88 commented 8 years ago

+1 for the ignore field in dynamic templates!

gmoskovicz commented 8 years ago

+1 for this!

I think that we need a way to discard fields.

Another option is to add a new dynamic setting as a pattern like the Automatic Index Creation. My proposal is that dynamic has 4 possible values:

@clintongormley @djschny @rjernst @jpountz how does that sound?

rjernst commented 8 years ago

I think that adds more complication than we should deal with. We already have the ability to control dynamic on a per-field-and-subfields basis. If we were to allow a pattern based approach, then it should be a single setting at the root of the mapping, otherwise there are weird cases you can get into (what if the setting disagrees with the explicitly set dynamic in a field already existing that matches the pattern).

rjernst commented 8 years ago

Note that by "single field at the root of the mapping" I mean we would need to remove the ability to set dynamic within the mappings altogether, except at the root.

gmoskovicz commented 8 years ago

I see, @rjernst so maybe we need a new setting to avoid this. Currently one cannot control per level, which "dynamic" fields should be created or not.

gmoskovicz commented 8 years ago

Example: at the root level, I want to create ONLY fields whose names start with field_:

POST /my_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "my_type": {
      "dynamic_templates": [
        {
          "item": {
            "match": "field_*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "analyzer": "english"
            }
          }
        }
      ]
    }
  }
}

clembac commented 7 years ago

Hi @jpountz, do you know if this setting has been implemented in the new release?

jpountz commented 6 years ago

@clembac It hasn't been implemented. I think @clintongormley's proposal is reasonable here: add a new rule that rejects mapping updates when there is a match. I'm adding the discuss label. Here is a proposed syntax:

PUT _template/mystack_wildcard
{
  "template": "mystack*",
  "mappings": {
    "question": {
      "dynamic": "strict",
      "dynamic_templates": [
        {
          "strings": {
            "match": "s_*",
            "mapping": { "type": "string" }
          }
        },
        {
          "default": {
            "match": "*",
            "mappings": null
          }
        }
      ]
    }
  }
}

cc @elastic/es-search-aggs

tomcallahan commented 6 years ago

We discussed this issue in FixItFriday and everyone agreed that Clint's proposal was reasonable.

markw commented 5 years ago

Is this a feature that's available in a recent release? We were trying to figure out how to do the same thing.

mayya-sharipova commented 5 years ago

@markw We still need to develop this feature

gabriele83 commented 5 years ago

Is there any news on this? We have been discussing this problem since 2015.

jetnet commented 4 years ago

Any news? I'd like to use dynamic templates but reject documents containing fields that are neither explicitly defined nor matched by one of the defined dynamic templates. A new option for dynamic could be:

"dynamic": "templates"

which means: strict + matched dynamic fields.

Thanks a lot!

tomhe commented 4 years ago

which means: strict + matched dynamic fields

I need false + matched dynamic fields.

teebu commented 4 years ago

What's the status on this?

Hronom commented 3 years ago

I'm here from 2021. What's the status? This is very much needed.

dorian-marchal commented 2 years ago

I found a workaround that behaves like dynamic: strict but still accepts matched dynamic fields. Adding a final dynamic_templates entry that matches everything causes an error to be returned when an unknown dynamic field is added:

PUT my-index
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "static_property": { "type": "keyword" },
      "dynamic_properties": { "dynamic": true, "type": "object" }
    },
    "dynamic_templates": [{
      "dynamic_properties.<uuid>": {
        "match_pattern": "regex",
        "path_match": "^dynamic_properties\\.[a-z0-9-]{36}$",
        "mapping": { "type": "keyword" }
      }
    }, {
      "reject_others": {
        "match": "*",
        "mapping": { "dynamic": "strict" }
      }
    }]
  }
}

I suspect this relies on the mapping property being invalid, but it works as expected:

POST /my-index/_doc
{ "static_property": "foo"}
-> Statically indexes `static_property`
POST /my-index/_doc
{ "unknown": {"a4541": "2021" }}
-> Ignores `unknown`
POST /my-index/_doc
{ "dynamic_properties": { "e581471a-e8b2-40d1-985f-e73cef2e0171": "bar" }}
-> Dynamically indexes `dynamic_properties.e581471a-e8b2-40d1-985f-e73cef2e0171`
POST /my-index/_doc
{ "dynamic_properties": { "unknown": "bar" }}
-> Throws an error: `"unknown parameter [dynamic] on mapper [unknown] of type [null]"`
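As a quick sanity check of which field paths the first template is meant to catch, the regex can be tried outside Elasticsearch. The sketch below uses Python's `re` module rather than the Java regex engine Elasticsearch uses, on the assumption that this particular pattern (anchors, an escaped dot, a character class, and a fixed repetition count) behaves the same in both. The helper name `matches_template` is hypothetical.

```python
import re

# Same regex as the path_match value above. In the JSON body the backslash is
# doubled ("\\."); the actual pattern contains a single escaped dot.
PATH_MATCH = re.compile(r"^dynamic_properties\.[a-z0-9-]{36}$")


def matches_template(path):
    """Return True if the full field path matches the <uuid> template regex."""
    return PATH_MATCH.search(path) is not None


print(matches_template("dynamic_properties.e581471a-e8b2-40d1-985f-e73cef2e0171"))  # True
print(matches_template("dynamic_properties.unknown"))  # False
```

Paths that do not match fall through to the final catch-all template, which is where the error in the last request above comes from.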

natsen commented 1 year ago

This feature is more relevant now with elasticsearch support for runtime fields. Users can by default store all fields, experiment with runtime fields and map only fields that are found useful for the visualizations/use cases.

javanna commented 1 year ago

This feature is more relevant now with elasticsearch support for runtime fields.

Maybe, but there's also dynamic:runtime if you want to map everything as runtime by default.

abishekhkamath commented 4 months ago

I think a clean way to do it would be to leave the dynamic: true setting in place, but allow a final rule in the dynamic templates that says either: ignore this field or throw an error.

Hi Elastic Team @jpountz @javanna, any updates on this? I am looking for a way to allow an object to have dynamic subfields matching a certain pattern and ignore others (dynamic: false). For example:

"dynamic_templates": [
  {
    "attribute.NAME$<lang>-lang-suffix": {
      "mapping": {
        "type": "text",
        "analyzer": "standard"
      },
      "match_pattern": "regex",
      "path_match": "^\\Qattribute.NAME$\\E(de|no|fi|ru|sv|ko|pt|en|it|fr|hu|zh|es|th|ja|da|ro|nl|tr)(_..)?$"
    }
  },
  {
    "catch-other-attributes": {
      "mapping": {
        "dynamic": false
      },
      "path_match": ["attribute.*"]
    }
  }
]

elasticsearchmachine commented 3 months ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)