elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Make total fields limit less of a nuisance to users #89911

Open javanna opened 1 year ago

javanna commented 1 year ago

Since version 5.0, every index created within Elasticsearch has a maximum total number of fields defined in its mappings, which defaults to 1000. Fields can be manually added to the index mappings through the put mappings API or via dynamic mappings by indexing documents through the index API. A request that causes the total fields count to go above the limit is rejected, whether that be a put mappings, a create index or an index call. The total fields limit can be manually increased using the update index settings API.
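For reference, the limit can be raised on an existing index through the update index settings API, roughly like this (the index name and the new value are placeholders, not a recommendation):

PUT /my-index-000001/_settings
{
  "index.mapping.total_fields.limit": 2000
}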

The main reason why the total fields limit was introduced (see #11443) is to prevent a mappings explosion caused by ingesting a bogus document with many JSON keys (see #73460 for an example). A mappings explosion impacts the size of the cluster state, the memory footprint of data nodes, and hinders stability.

While the total fields limit is a safety measure against mappings explosion, it is not an effective way to prevent data nodes from going out of memory due to too many fields being defined: it's an index-based limit, meaning that you can have 10k indices with 990 fields each without hitting the limit (yet possibly running into problems depending on the available resources), while a single index with more than 1000 fields is not allowed. Data nodes load mappings only for the indices that have at least one shard allocated to them, which makes it quite difficult to pick a limit that effectively prevents data nodes from going out of memory.

It is quite common for users to reach the total fields limit, which causes ingestion failures and a consequent need to increase the limit. Our Solutions (e.g. APM) increase the total fields limit too. The fact that many users end up reaching the limit despite not ingesting bogus documents sounds like a bug: ideally the limit would be reached only with a very high number of fields that is very likely to be caused by a bogus document, and no user would have to know about or increase the limit otherwise.

The total fields limit has been around for quite some time, so it may very well be that the 1000 default was reasonably high when it was introduced but turned out to be too low over time. Possibly, all the recent improvements in cluster state handling around many shards and many indices have also helped support more fields in the mappings. An area of improvement is the memory footprint of mappings within data nodes (see #86440), and once we improve that we will be able to support even more fields, yet this is a tangential issue given that the current limit does not prevent data nodes from going out of memory.

I'd propose that we consider making the following changes, with the high-level goal of making the total fields limit less visible to users while still being effective for its original purpose:

Is there any preparation work needed to feel confident that making the total fields limit more permissive does not cause problems? Could we end up allowing for situations that would have previously legitimately hit the limit?

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-search (Team:Search)

mitar commented 1 year ago

apply the total fields limit only to dynamic mappings update

I think this would be a reasonable approach, especially if you could configure the limit on sub-documents as well (some parts of a document might come from users and you may want to limit dynamic mappings there).
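For illustration, the closest existing knob is the object-level dynamic mapping parameter, which disables (rather than caps) dynamic field additions under a given sub-object; this is only a sketch, and the index and field names are made up:

PUT /my-index-000001
{
  "mappings": {
    "dynamic": true,
    "properties": {
      "user_supplied": {
        "type": "object",
        "dynamic": false
      }
    }
  }
}

With this, new keys under user_supplied are still kept in _source but are not added to the mappings.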

javanna commented 1 year ago

We discussed this with the team and agreed on the following:

original-brownbear commented 1 year ago

++ to the above, as discussed on another channel.

Applying the limit only to dynamic mapping updates seems like the way to go to me too. We have a lot of built-in mappings with far more than 1k fields in our own products already.

For dynamic mapping updates I think a higher limit may make sense if this is something that's actually causing trouble for users. The only reservation I had in this regard was that unlike a static mapping of thousands of fields, a dynamic mapping of thousands of fields will have data in every field. This causes higher memory use from Lucene data structures than just having unused fields in a static mapping. As we discussed on another channel, that kind of issue shouldn't be addressed by the default value of this setting though but rather by other means of reducing per-field overhead.

felixbarny commented 1 year ago

Huge +1 on everything that was said here.

Just one addition that wasn't discussed yet in this thread. I think that the biggest pain with the field limit is that it causes data loss. Therefore, there should be a mode where hitting the field limit doesn't lead to rejecting the document, but to not adding additional dynamic fields once the limit has been reached. In other words, it should be possible to index the first 1000 dynamic fields, and after that, additional fields would just be stored but not added to the mapping (similar to dynamic: false).

This would resolve a huge pain point that we have for logging and tracing data, where bogus documents or misuse of an API can lead to data loss. Since multiple services share the same data stream in APM, a single misbehaving service can cause data loss for all services.

felixbarny commented 5 months ago

Update: We've merged

This addresses the document loss when adding dynamic fields beyond the limit. It doesn't cover applying the field limit only to dynamic mapping updates.
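Assuming the new index.mapping.total_fields.ignore_dynamic_beyond_limit setting behaves as its name suggests, opting in at index creation could look roughly like this (the index name and limit value are placeholders):

PUT /my-index-000001
{
  "settings": {
    "index.mapping.total_fields.limit": 1000,
    "index.mapping.total_fields.ignore_dynamic_beyond_limit": true
  }
}

Dynamic fields beyond the limit would then be left out of the mappings (while still being kept in _source) instead of causing the whole document to be rejected.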

jackgray commented 2 months ago

Is there a way to change this setting on all future indices? I don't understand how to prevent document loss if there is no IaC way of setting this rule before the index is created and starts skipping fields. How would I apply this rule to sharded indices, where streams are piped to an index name with a timestamp?

It seems that this setting cannot be defined in any configuration file like most others can, or in the Stack Management advanced settings. This has been a massive usability barrier for our use of Elasticsearch, and there is very little documentation about it. Our ingest data hasn't even exceeded 4GB at this stage.

mitar commented 2 months ago

It is easy to specify this in your index configuration:

{
  "settings": {
    "index.mapping.total_fields.limit": 20000
  },
  "mappings": {}
}

felixbarny commented 2 months ago

Hey @jackgray, you can use index templates to define the mappings for an index pattern before these indices exist.

Elasticsearch also ships with a default index template for logs-*-*, which has the new index.mapping.total_fields.ignore_dynamic_beyond_limit setting preconfigured. So if you send logs to logs-generic-default or logs-myapp-default, you automatically get the recommended default settings, including ignore_dynamic_beyond_limit.

Besides that, the default index template for logs-*-* also creates a data stream, which is recommended over index names with a timestamp.
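For data streams you manage yourself, a sketch of an index template carrying the same settings could look like the following; the template name, pattern, priority, and limit value are made up, and the priority only needs to be higher than that of any overlapping built-in template:

PUT _index_template/my-logs-template
{
  "index_patterns": ["logs-myapp-*"],
  "data_stream": {},
  "priority": 200,
  "template": {
    "settings": {
      "index.mapping.total_fields.limit": 1000,
      "index.mapping.total_fields.ignore_dynamic_beyond_limit": true
    }
  }
}

Any data stream matching logs-myapp-* that is created after this template exists picks these settings up automatically, which also covers the "all future indices" part of the question above.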