elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Mapping conflict indexing array with float and long values #96950

Open AdrianConev opened 1 year ago

AdrianConev commented 1 year ago

Elasticsearch Version

8.8

Installed Plugins

No response

Java Version

bundled

OS Version

MacOS

Problem Description

A scripted upsert fails when it writes an array of objects in which the same field has values of different types (here, a float and a long). This does not happen in Elasticsearch 5.6.

Steps to Reproduce

Here's the upsert. (Apologies for the Postman formatting.)

POST localhost:9200/test-index/_update/1/

{
  "scripted_upsert": true,
  "script": {
    "source": "for (entry in params.data.entrySet()) { ctx._source[entry.getKey()] = entry.getValue() }",
    "lang": "painless",
    "params": {
      "data": {
        "nested": [
          { "attributes": { "price": 30.0 } },
          { "attributes": { "price": 20 } }
        ]
      }
    }
  },
  "upsert": {}
}

This returns:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "mapper [nested.attributes.price] cannot be changed from type [float] to [long]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "mapper [nested.attributes.price] cannot be changed from type [float] to [long]"
  },
  "status": 400
}

In Elasticsearch 5.6 this is not the case. The identical body goes through, with a different request line, of course, because of the mapping type changes. So, using this request line instead:

POST localhost:9200/test-index/item/1/_update/

This goes through in 5.6. Is this a side effect of an intended change? If so, is there a workaround or a feature I am not aware of that would solve this?

Logs (if relevant)

No response

thecoop commented 1 year ago

There have been several significant changes in mapping since 5.6 (for example, https://www.elastic.co/guide/en/elasticsearch/reference/7.17/removal-of-types.html). It is likely you'll need to re-write your scripts for v8.

I have labelled the search team to see if they have any further information on this functionality.

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-search (Team:Search)

javanna commented 1 year ago

@AdrianConev do you run that on an empty index or do you have previously indexed documents, as well as existing mappings?

AdrianConev commented 1 year ago

Both were tested, on empty indices/mappings and on "dirty" ones. Are you unable to reproduce this? @javanna

AdrianConev commented 1 year ago

Hi, any updates? I would like to add that if the original request is split into two, the error doesn't happen. These two requests, sent in succession, go through:

{
  "scripted_upsert": true,
  "script": {
    "source": "for (entry in params.data.entrySet()) { ctx._source[entry.getKey()] = entry.getValue() }",
    "lang": "painless",
    "params": {
      "data": {
        "nested": [
          { "attributes": { "price": 30.0 } }
        ]
      }
    }
  },
  "upsert": {}
}

{
  "scripted_upsert": true,
  "script": {
    "source": "for (entry in params.data.entrySet()) { ctx._source[entry.getKey()] = entry.getValue() }",
    "lang": "painless",
    "params": {
      "data": {
        "nested": [
          { "attributes": { "price": 20 } }
        ]
      }
    }
  },
  "upsert": {}
}

This makes me strongly suspect this is a bug.

javanna commented 3 months ago

This isn't an upsert- or script-specific problem. If you try to index what the script outputs, you get the same error:

POST localhost:9200/test/_doc/1
{ 
  "price2" : [30.0, 20] 
}

We end up adding a dynamic float mapper, and later a dynamic long mapper for the same field, and we fail indexing because of that. I would recommend in this case updating the params of the script, or the document itself to use the same type in all the items of the provided array.
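
For anyone who wants to apply that recommendation on the client side, here is a minimal Python sketch, not an official workaround. It assumes the script params are built as plain Python dicts and lists before being sent; the helper name `unify_numeric_types` is hypothetical. It finds every field path that carries both integer and floating-point values and rewrites the integers as floats, so the whole array serializes with a single numeric type.

```python
import json


def _collect(value, path, seen):
    """Record which numeric Python types appear under each dotted field path."""
    if isinstance(value, dict):
        for key, child in value.items():
            _collect(child, f"{path}.{key}" if path else key, seen)
    elif isinstance(value, list):
        for item in value:
            _collect(item, path, seen)  # array items share their parent's path
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        seen.setdefault(path, set()).add(type(value))


def _rewrite(value, path, mixed):
    """Return a copy of value with ints turned into floats on mixed paths."""
    if isinstance(value, dict):
        return {
            key: _rewrite(child, f"{path}.{key}" if path else key, mixed)
            for key, child in value.items()
        }
    if isinstance(value, list):
        return [_rewrite(item, path, mixed) for item in value]
    if path in mixed and isinstance(value, int) and not isinstance(value, bool):
        return float(value)
    return value


def unify_numeric_types(doc):
    """Hypothetical helper: make int/float usage consistent per field path."""
    seen = {}
    _collect(doc, "", seen)
    mixed = {path for path, types in seen.items() if types == {int, float}}
    return _rewrite(doc, "", mixed)


# Params taken from the request in this issue.
data = {"nested": [{"attributes": {"price": 30.0}}, {"attributes": {"price": 20}}]}
print(json.dumps(unify_numeric_types(data)))
```

Fields that only ever hold integers are left untouched, so they would still be dynamically mapped as long. Only paths that mix the two types are changed.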

What is rather strange is that if you index two separate documents, one with a float and one with a long value for the same field, everything works fine: the input is coerced to whichever field type was dynamically mapped (the first indexed document decides the type).

elasticsearchmachine commented 2 months ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)