elastic / elasticsearch-migration

This plugin will help you to check whether you can upgrade directly to the next major version of Elasticsearch, or whether you need to make changes to your data and cluster before doing so.

Add documentation on how to recover from a reindex failure in migration plugin #69

Closed ppf2 closed 7 years ago

ppf2 commented 7 years ago

I have an index from 1.x that fails with TypeError: Cannot read property '0' of undefined when I use the 2.x migration plugin to reindex it. The migration plugin reports an error (in this case, two indices failed with two different errors, and both are marked Error by the migration plugin):

image

The message above is misleading: it can be confused with the health of the actual index in the cluster, which is green.

image

In fact, the message refers to the status of the migration plugin's reindexing process, as recorded in the .reindex-status index:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": ".reindex-status",
        "_type": "index",
        "_id": ".watches",
        "_score": 1,
        "_source": {
          "reindex_status": "error",
          "aliases": {},
          "refresh": "1s",
          "replicas": "0",
          "task_id": "JeyTqv9uTdOnlsl1eMIvVQ:1769",
          "error": "Health of index `.watches` is `missing`, not `green`. Not resetting."
        }
      },
      {
        "_index": ".reindex-status",
        "_type": "index",
        "_id": ".marvel-2016.10.12",
        "_score": 1,
        "_source": {
          "reindex_status": "error",
          "aliases": {},
          "refresh": "1s",
          "replicas": "0",
          "task_id": "JeyTqv9uTdOnlsl1eMIvVQ:2526",
          "error": "Health of index `.marvel-2016.10.12` is `missing`, not `green`. Not resetting."
        }
      }
    ]
  }
}
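
To see at a glance which indices the plugin considers failed, the response above can be filtered on reindex_status. The following is only a minimal sketch: the function name failed_reindex_entries is hypothetical, and it assumes the response has already been fetched (for example with a search against .reindex-status) and parsed into a Python dict.

```python
def failed_reindex_entries(search_response):
    """Return (index_name, error_message) pairs for entries marked as failed.

    Assumes the document _id is the name of the index being reindexed,
    as in the .reindex-status response shown in this issue.
    """
    failures = []
    for hit in search_response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        if source.get("reindex_status") == "error":
            failures.append((hit["_id"], source.get("error", "")))
    return failures


# Trimmed copy of the response from this issue.
sample_response = {
    "hits": {
        "total": 2,
        "hits": [
            {
                "_index": ".reindex-status",
                "_type": "index",
                "_id": ".watches",
                "_source": {
                    "reindex_status": "error",
                    "error": "Health of index `.watches` is `missing`, "
                             "not `green`. Not resetting.",
                },
            },
            {
                "_index": ".reindex-status",
                "_type": "index",
                "_id": ".marvel-2016.10.12",
                "_source": {
                    "reindex_status": "error",
                    "error": "Health of index `.marvel-2016.10.12` is "
                             "`missing`, not `green`. Not resetting.",
                },
            },
        ],
    }
}

for name, error in failed_reindex_entries(sample_response):
    print(f"{name}: {error}")
```

Note that the error text describes the state of the reindex target, not the health of the original index in the cluster.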

To recover from this, the user has to delete the corresponding records from the .reindex-status index before the UI will allow them to retry the reindex. If they instead want to follow the manual steps from the info icon, they first have to delete any previously generated <index_name>-<version> index from the cluster.
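
The cleanup described above could be sketched as a list of REST calls. This is an illustration under assumptions, not the plugin's own procedure: recovery_requests is a hypothetical helper, the status document is assumed to live at type index with the index name as its _id (as in the response shown in this issue), and the name of the generated index must be supplied by the caller because the version suffix is not known here. The sketch only builds the requests and does not send them.

```python
from urllib.parse import quote

STATUS_INDEX = ".reindex-status"  # status index named in this issue
STATUS_TYPE = "index"             # _type seen in the search response


def recovery_requests(index_name, generated_index=None):
    """Build (method, path) pairs for cleaning up a failed reindex.

    index_name      -- the index whose reindex failed (the status record _id)
    generated_index -- optional name of the previously generated
                       <index_name>-<version> index to remove before
                       retrying the manual steps
    """
    requests = [
        # Remove the status record so the UI allows a retry.
        ("DELETE", f"/{STATUS_INDEX}/{STATUS_TYPE}/{quote(index_name, safe='')}"),
    ]
    if generated_index is not None:
        # Remove the leftover target index from the failed attempt.
        requests.append(("DELETE", f"/{quote(generated_index, safe='')}"))
    return requests


# "my_index-2.x" is a made-up name used purely for illustration.
for method, path in recovery_requests("my_index", "my_index-2.x"):
    print(method, path)
```

Keeping the request construction separate from the HTTP client makes it easy to review the paths before running any destructive DELETE against a cluster.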

Perhaps we can add some information on the above to the reindex helper description text?

clintongormley commented 7 years ago

Instead, I've fixed the underlying bug so that the reset process will correctly delete the new index.