elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

`GET` API does not return `_ignored` by default #107750

Open salvatore-campagna opened 2 months ago

salvatore-campagna commented 2 months ago

Elasticsearch Version

8.5.0 and above

Installed Plugins

No response

Java Version

bundled

OS Version

All

Problem Description

I discovered this issue while working on #101373. Up to and including version 8.4.3, the GET API returned the `_ignored` field by default. From 8.5.0 onwards it no longer does: users must request the field explicitly by adding `_ignored` to `stored_fields`.
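For reference, the explicit request described above might look like the sketch below. The `doc_url` helper and the `ES_URL` variable are hypothetical names introduced here for illustration. `_source=true` is included on the assumption that asking for stored fields can otherwise suppress `_source` in the response.

```shell
# Assumption: the cluster is reachable at this address; override ES_URL if not.
ES_URL="${ES_URL:-https://localhost:9200}"

# Build a GET URL that explicitly asks for the `_ignored` metadata field.
# $1 = index name, $2 = document id.
doc_url() {
  printf '%s/%s/_doc/%s?stored_fields=_ignored&_source=true&pretty' "$ES_URL" "$1" "$2"
}

# Against a live cluster (left commented out so the snippet has no side effects):
# curl --cacert config/certs/http_ca.crt -u "elastic:$ELASTIC_PASSWORD" \
#   -X GET "$(doc_url test-index 1)"
```

This is only a workaround on the client side; it does not restore the pre-8.5.0 default.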

Steps to Reproduce

I used the following commands to reproduce the issue:

# Check Elasticsearch version
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X GET "https://localhost:9200?pretty"

# Create a mapping suitable to store ignored fields
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X PUT "https://localhost:9200/test-index?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "age":    { "type": "integer", "ignore_malformed": true },  
      "email":  { "type": "keyword", "ignore_above": 128  }, 
      "name":   { "type": "keyword", "ignore_above": 10  }     
    }
  }
}'

# Check the mapping is ok
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X GET "https://localhost:9200/test-index/_mapping?pretty"

# Index a document with an ignored value (`age` is expected to be numeric)
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X PUT "https://localhost:9200/test-index/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
  "age": "unknown",
  "email": "bob@gmail.com",
  "name": "bob"
}'

# Verify if `_ignored` is returned
curl --cacert config/certs/http_ca.crt -u elastic:$ELASTIC_PASSWORD -X GET "https://localhost:9200/test-index/_doc/1?pretty"

Running these commands against versions 8.4.3 and 8.5.0 reveals a difference in behaviour that was never reported as a breaking change for 8.5.0. Results for 8.4.3 are as follows:

{
  "_index" : "test-index",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "_ignored" : [
    "age"
  ],
  "found" : true,
  "_source" : {
    "age" : "unknown",
    "email" : "bob@gmail.com",
    "name" : "bob"
  }
}

The response from 8.5.0 omits `_ignored`:

{
  "_index" : "test-index",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "age" : "unknown",
    "email" : "bob@gmail.com",
    "name" : "bob"
  }
}
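To compare versions without reading the JSON by eye, the check could be scripted roughly as follows. `has_ignored` is a hypothetical helper, and the plain-text match is a simplification: it would also fire if the key name appeared inside `_source`, so a JSON-aware tool would be more robust.

```shell
# Succeeds (exit status 0) when the JSON on stdin contains an `_ignored` key.
has_ignored() {
  grep -q '"_ignored"[[:space:]]*:'
}

# Against a live cluster (assumes the index and document created in the steps above):
# curl -s --cacert config/certs/http_ca.crt -u "elastic:$ELASTIC_PASSWORD" \
#   -X GET "https://localhost:9200/test-index/_doc/1" \
#   | has_ignored && echo "_ignored returned" || echo "_ignored missing"
```

With the two responses shown above, the helper would succeed for the 8.4.3 output and fail for the 8.5.0 output.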

Logs (if relevant)

No response

elasticsearchmachine commented 2 months ago

Pinging @elastic/es-search (Team:Search)