elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

[Meta] Better handling of single-valued fields #80825

Open markharwood opened 2 years ago

markharwood commented 2 years ago

Background

For a long time Elasticsearch has been very permissive about JSON documents and has made no distinction between single values and arrays of values. This permissive approach has several downsides:
1) Client code and scripts are made more complex. To be robust, code must be written to handle both single-valued fields and arrays of values.
2) Kibana does some strange things. For example, Kibana will happily try to "AND" multiple values selected from a bar chart or pie chart, which never makes sense for values taken from a single-valued field. This produces no matches because no document can be OS:ios and OS:android simultaneously.
3) Administrators cannot easily "lock down" the mapping. Custom ingest scripts are required to prevent multi-valued documents being added (and ingest scripts can still be circumvented by clients sending documents?).
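To illustrate points 1 and 2, here is a minimal Python sketch of the defensive code a client ends up writing. It is not taken from any Elasticsearch or Kibana client; the helper names `as_list` and `matches_all` and the `os` field are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List


def as_list(value: Any) -> List[Any]:
    """Normalise a field value that may be missing, a single value, or an array."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def matches_all(doc: Dict[str, Any], field: str, wanted: List[Any]) -> bool:
    """Emulate AND-ing several term filters on one field (hypothetical helper)."""
    values = as_list(doc.get(field))
    return all(term in values for term in wanted)


# A single-valued document can never satisfy OS:ios AND OS:android.
single = {"os": "ios"}
# Only a multi-valued document can, which is rarely what a weblog contains.
multi = {"os": ["ios", "android"]}

print(matches_all(single, "os", ["ios", "android"]))
print(matches_all(multi, "os", ["ios", "android"]))
```

Every consumer of the field needs the `as_list` step, because the mapping gives no guarantee about which shape a given document uses.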

All of the above is unfortunate because the majority of fields in common use are single-valued. A weblog's fields are a good example (timestamp, IP, OS, user agent, URL, referrer, country, etc. are all single values).

Proposed changes

The solution is a two-pronged approach:
Enforcement: for new indices, we can give administrators the option of rejecting documents with multiple values.
Reporting: for both new and old indices, we can report whether the index contains only documents with single values.
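The intended semantics of the two prongs can be sketched over plain JSON-like documents. This is an illustration only, not a proposed Elasticsearch API: the function names are hypothetical, and it assumes that "multi-valued" means an array with more than one element (a stricter rule could reject any array).

```python
from typing import Any, Dict, Iterable, Set


def multi_valued_fields(doc: Dict[str, Any], prefix: str = "") -> Set[str]:
    """Return dotted paths of fields holding an array with more than one element."""
    found: Set[str] = set()
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            # Recurse into object fields, extending the dotted path.
            found |= multi_valued_fields(value, path + ".")
        elif isinstance(value, list) and len(value) > 1:
            found.add(path)
    return found


def enforce_single_valued(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Enforcement: reject a document that contains any multi-valued field."""
    offending = multi_valued_fields(doc)
    if offending:
        raise ValueError("multi-valued fields rejected: " + ", ".join(sorted(offending)))
    return doc


def only_single_values(docs: Iterable[Dict[str, Any]]) -> bool:
    """Reporting: True if every document contains only single values."""
    return not any(multi_valued_fields(doc) for doc in docs)


weblog = {"os": "ios", "geo": {"country": "US"}}
tagged = {"os": ["ios", "android"]}

print(only_single_values([weblog]))
print(only_single_values([weblog, tagged]))
```

Enforcement acts per document at write time, while reporting is a property of the whole index, which is why it can also apply to indices created before any enforcement option existed.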

elasticmachine commented 2 years ago

Pinging @elastic/es-search (Team:Search)

jpountz commented 2 years ago

Some thoughts on this proposal:

elasticsearchmachine commented 3 months ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)