elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

x_content_parse_exception exceptions should contain the document id that failed to parse. #48432

Open niemyjski opened 5 years ago

niemyjski commented 5 years ago

Elasticsearch version (bin/elasticsearch --version): 7.4.1

Plugins installed: ["mapper-size"]

JVM version (java -version): latest docker image.

OS version (uname -a if on a Unix-like system): latest docker image.

Description of the problem including expected versus actual behavior:

I'm doing an external reindex from Elasticsearch 5.6.16 to 7.4.1, and it's working great except for one index with 3 million documents, where I randomly get a parse exception. It sure would be great to know which document it is so I could look at it.

I'm not sure on the steps to reproduce other than start an external reindex and get the task status (how I'm seeing this error). If I knew the document I could post it with the mapping.
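For anyone trying to reproduce this, here is a rough sketch of the two requests involved, built from the hosts and index names visible in the task description below. The `http` scheme, the batch size of 100, and the helper names `remote_reindex_body` and `task_status_path` are assumptions for illustration only. Lowering `source.size` may make it easier to narrow down which batch contains the failing document.

```python
import json
from urllib.parse import quote


def remote_reindex_body(remote_host, source_index, dest_index, batch_size=None):
    """Build a request body for POST /_reindex that pulls from a remote cluster."""
    body = {
        "source": {"remote": {"host": remote_host}, "index": source_index},
        "dest": {"index": dest_index},
    }
    if batch_size is not None:
        # source.size controls how many documents are fetched per batch.
        body["source"]["size"] = batch_size
    return body


def task_status_path(task_id):
    """Return the URL path for polling a task; the colon in the id is percent-encoded."""
    return "/_tasks/" + quote(task_id, safe="")


# Host scheme and batch size are assumptions, not taken from the report.
body = remote_reindex_body(
    "http://docker.for.mac.localhost:9210",
    "test-search-v44",
    "local-test-search-v44",
    batch_size=100,
)
print("POST /_reindex?wait_for_completion=false")
print(json.dumps(body, indent=2))
print("GET " + task_status_path("hjNNxRi_SQSSihj20T6Vbg:4889"))
```

The code only builds the body and path; sending them is left to whatever HTTP client you already use.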

Provide logs (if relevant):

Successful (200) low level call on GET: /_tasks/hjNNxRi_SQSSihj20T6Vbg%3A4889?pretty=true&error_trace=true&wait_for_completion=false
# Audit trail of this API call:
 - [1] HealthyResponse: Node: http://localhost:9200/ Took: 00:00:00.0110890
# Request:
<Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
# Response:
{
  "completed" : true,
  "task" : {
    "node" : "hjNNxRi_SQSSihj20T6Vbg",
    "id" : 4889,
    "type" : "transport",
    "action" : "indices:data/write/reindex",
    "status" : {
      "total" : 50501,
      "updated" : 0,
      "created" : 2000,
      "deleted" : 0,
      "batches" : 2,
      "version_conflicts" : 0,
      "noops" : 0,
      "retries" : {
        "bulk" : 0,
        "search" : 0
      },
      "throttled_millis" : 0,
      "requests_per_second" : -1.0,
      "throttled_until_millis" : 0
    },
    "description" : "reindex from [host=docker.for.mac.localhost port=9210 pathPrefix=/ query={\n  \"match_all\" : {\n    \"boost\" : 1.0\n  }\n}][test-search-v44] to [local-test-search-v44][_doc]",
    "start_time_in_millis" : 1571866634051,
    "running_time_in_nanos" : 353592096200,
    "cancellable" : true,
    "headers" : { }
  },
  "error" : {
    "type" : "exception",
    "reason" : "Error parsing the response, remote is likely not an Elasticsearch instance",
    "caused_by" : {
      "type" : "x_content_parse_exception",
      "reason" : "[1:6032145] [search_response] failed to parse field [hits]",
      "caused_by" : {
        "type" : "x_content_parse_exception",
        "reason" : "[1:6032145] [hits] failed to parse field [hits]",
        "caused_by" : {
          "type" : "x_content_parse_exception",
          "reason" : "[1:6032145] [hit] failed to parse field [_source]",
          "caused_by" : {
            "type" : "parsing_exception",
            "reason" : "[hit] failed to parse [_source]",
            "line" : 1,
            "col" : 6032145,
            "caused_by" : {
              "type" : "json_parse_exception",
              "reason" : "Duplicate field 'comments'\n at [Source: org.apache.http.nio.entity.ContentInputStream@1ea0ba31; line: 1, column: 6032160]",
              "suppressed" : [
                {
                  "type" : "illegal_state_exception",
                  "reason" : "Failed to close the XContentBuilder",
                  "caused_by" : {
                    "type" : "i_o_exception",
                    "reason" : "Unclosed object or array found"
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
}
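
The useful part of the response above is buried at the bottom of the nested `caused_by` chain. As a small sketch (the function name `cause_chain` is made up for this example, and the sample is a trimmed copy of the error above), the chain can be flattened like this to get at the innermost cause:

```python
def cause_chain(error):
    """Return (type, reason) pairs from an error object, outermost first,
    by following nested "caused_by" entries."""
    chain = []
    node = error
    while isinstance(node, dict):
        chain.append((node.get("type"), node.get("reason")))
        node = node.get("caused_by")
    return chain


# Trimmed copy of the "error" object from the task response above.
error = {
    "type": "exception",
    "reason": "Error parsing the response, remote is likely not an Elasticsearch instance",
    "caused_by": {
        "type": "x_content_parse_exception",
        "reason": "[1:6032145] [hit] failed to parse field [_source]",
        "caused_by": {
            "type": "json_parse_exception",
            "reason": "Duplicate field 'comments'",
        },
    },
}

chain = cause_chain(error)
root_type, root_reason = chain[-1]
print(root_type, "->", root_reason)
```

Here the innermost cause is the `json_parse_exception` about the duplicate `comments` field, but nothing in the chain says which document it belongs to.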
elasticmachine commented 5 years ago

Pinging @elastic/es-search (:Search/Mapping)

niemyjski commented 5 years ago

Any updates on this? Or is there a better way to figure out what document is causing this behavior?
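
One possible workaround, sketched under assumptions: since the innermost error is `Duplicate field 'comments'`, the raw search responses from the source cluster could be scanned for hits that repeat a key. The sketch assumes the raw response body is available as text from whatever HTTP client you use; the function name `hits_with_duplicate_fields` and the sample data are made up. Python's `json` module silently accepts duplicate keys, but `object_pairs_hook` receives every key/value pair, so the repeats can be detected.

```python
import json


def hits_with_duplicate_fields(raw_response):
    """Map hit _id -> sorted duplicate field names found anywhere inside that hit."""
    result = {}
    pending = set()  # duplicate keys seen since the last completed hit

    def hook(pairs):
        seen = set()
        for key, _ in pairs:
            if key in seen:
                pending.add(key)
            seen.add(key)
        obj = dict(pairs)
        # The hook runs when an object closes, so objects nested in a hit
        # are processed before the hit itself. A document whose own fields
        # include both "_id" and "_source" would confuse this heuristic.
        if "_id" in obj and "_source" in obj:
            if pending:
                result[obj["_id"]] = sorted(pending)
            pending.clear()
        return obj

    json.loads(raw_response, object_pairs_hook=hook)
    return result


# Made-up sample in the shape of a search response.
raw = (
    '{"hits": {"hits": ['
    '{"_index": "test-search-v44", "_id": "1", "_source": {"comments": "a"}},'
    '{"_index": "test-search-v44", "_id": "2", "_source": {"comments": "a", "comments": "b"}}'
    ']}}'
)
print(hits_with_duplicate_fields(raw))  # {'2': ['comments']}
```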

niemyjski commented 5 years ago

This is still happening on Elasticsearch 7.4.2.

elasticmachine commented 2 years ago

Pinging @elastic/es-distributed (Team:Distributed)

javanna commented 2 years ago

This seems to me like a problem that's specific to the reindex API. When you are indexing a document, either through the index API or the bulk API, you'd get back the error and it would be possible to associate it with what document caused it. I believe the reindex API should be adapted to do something similar, probably wrap the exception or have better reporting around errors.
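
To illustrate the contrast with the bulk API, here is a minimal sketch of how a client can tie each failure back to a document from a bulk response, where every entry in `items` carries its own `_id` and, on failure, an `error` object. The function name `failed_bulk_items` and the sample values are invented for the example.

```python
def failed_bulk_items(bulk_response):
    """Return (action, index, id, error type, reason) for each failed bulk item."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action: index, create, update or delete.
        for action, result in item.items():
            error = result.get("error")
            if error:
                failures.append((
                    action,
                    result.get("_index"),
                    result.get("_id"),
                    error.get("type"),
                    error.get("reason"),
                ))
    return failures


# Hand-written sample in the shape of a bulk response; the values are made up.
sample = {
    "took": 3,
    "errors": True,
    "items": [
        {"index": {"_index": "local-test-search-v44", "_id": "1", "status": 201}},
        {"index": {"_index": "local-test-search-v44", "_id": "2", "status": 400,
                   "error": {"type": "mapper_parsing_exception",
                             "reason": "failed to parse"}}},
    ],
}

for failure in failed_bulk_items(sample):
    print(failure)
```

The remote reindex failure reported here happens earlier, while parsing the search response from the remote cluster, which is why no such per-document information is available.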

undermyumbrella1 commented 1 year ago

Can I try taking this issue? It's my first time contributing to OSS.

undermyumbrella1 commented 1 year ago

Hi @niemyjski @javanna, would the following response format be alright?

{
    "error": {
        "root_cause": [
            {
                "type": "exception",
                "reason": "Error parsing the response, remote is likely not an Elasticsearch instance"
            }
        ],
        "type": "exception",
        "reason": "Error parsing the response, remote is likely not an Elasticsearch instance",
        "caused_by": {
            "type": "x_content_parse_exception",
            "reason": "[1:505] [search_response] failed to parse field [hits]",
            "caused_by": {
                "type": "x_content_parse_exception",
                "reason": "[1:505] [hits] failed to parse field [hits]",
                "caused_by": {
                    "type": "x_content_parse_exception",
                    "reason": "[1:505] Error occured at index: megacorp document id: 127",
                    "caused_by": {
                        "type": "x_content_parse_exception",
                        "reason": "[1:505] [hit] failed to parse field [_source]",
                        "caused_by": {
                            "type": "parsing_exception",
                            "reason": "[hit] failed to parse [_source]",
                            "line": 1,
                            "col": 505,
                            "caused_by": {
                                "type": "json_parse_exception",
                                "reason": "Duplicate field 'age'\n at [Source: (org.apache.http.nio.entity.ContentInputStream); line: 1, column: 514]",
                                "suppressed": [
                                    {
                                        "type": "illegal_state_exception",
                                        "reason": "Failed to close the XContentBuilder",
                                        "caused_by": {
                                            "type": "i_o_exception",
                                            "reason": "Unclosed object or array found"
                                        }
                                    }
                                ]
                            }
                        }
                    }
                }
            }
        }
    },
    "status": 500
}

The document ID will appear in this line: "reason": "[1:505] Error occured at index: megacorp document id: 127"

Rationale:

Proposed solution:

It is my first time trying this out, so do let me know if there are better solutions. Thank you!

Kiriakos1998 commented 1 year ago

Hello @ellaella12, are you still working on this issue, or have you opened a PR for it? If not, I would love to try fixing it as my first contribution to Elasticsearch.

Kiriakos1998 commented 1 year ago

Hi @javanna, does this issue persist? If it does and no PR is attempting to solve it, I would love to try solving it.

Utsavk commented 1 year ago

Hi @javanna, can I try to solve this? It would be my first contribution.