elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Atomically Rename Index #37880

Open pickypg opened 5 years ago

pickypg commented 5 years ago

This is a duplicate of #17426, which was closed back in March 2018. However, things have changed rather significantly since then, and this feature has become more of a usability issue than a convenient way to fix an index's name.

Problem

With the introduction of Rollover, particularly within ILM, knowing a preexisting alias "name" is largely core to the usage of the API, and using it is incredibly useful.

However, in a lot of very common usage, users may not know enough about ES to understand the bootstrapping requirement and, more importantly, they may not be able to stop an automated service once the bootstrapping has been set up (e.g., imagine a bunch of Beats pushing to mybeatindex, but you delete the backing indices). Imagine having to stop all of those Beats instances just to create the alias again (noting that they would otherwise try to create the index using the alias' name, and then you would struggle to create a new index with that alias), or imagine someone deleting all of their indices and not resetting the alias before the next payload arrives.

Solution

Although much easier said than done, it would be incredibly helpful if the aliases API could atomically rename an index as part of setting up aliases.

If we supported that behavior, then any tool talking to ES could blindly assume that an alias exists -- no bootstrapping required -- and features like Rollover could automatically set up the alias in the same step as pointing its alias to the next index. This dodges all of the frustration with handling the alias in a scenario where things are already running.

Alternative Solution

A possible alternative may be to support renaming an index on creation via its template if it matches the alias. This could possibly be akin to kicking off rollover to get its non-colliding index pattern.

Partial Workaround

For users that need to delete all of their indices behind a Rollover alias, the right solution is to pre-create a new index and point the alias to it before deleting the other indices. However, this again comes back to understanding the bootstrapping process.
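
As a rough sketch of that workaround, the request body for `POST /_aliases` could be built as below. The index and alias names are made up for illustration, the new index is assumed to have been created already (e.g., with `PUT /mybeatindex-000002`), and this assumes the `add` and `remove_index` actions are accepted together in a single aliases request. It only builds the JSON; it does not talk to a cluster.

```python
import json


def repoint_then_delete(alias, new_index, old_indices):
    """Build a POST /_aliases body that points `alias` at `new_index`
    and removes the old backing indices in the same request."""
    actions = [
        # Make the pre-created index the write target for the alias.
        {"add": {"index": new_index, "alias": alias, "is_write_index": True}}
    ]
    # Drop every old backing index as part of the same request.
    actions += [{"remove_index": {"index": name}} for name in old_indices]
    return {"actions": actions}


# Hypothetical names, chosen only for this example.
body = repoint_then_delete(
    "mybeatindex", "mybeatindex-000002", ["mybeatindex-000001"]
)
print(json.dumps(body, indent=2))
```

The point of doing both in one request is that the alias never stops resolving, so a writer such as Beats should not get the chance to auto-create an index under the alias' name.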

elasticmachine commented 5 years ago

Pinging @elastic/es-core-features

pickypg commented 5 years ago

I have run into another need for this:

Weeks' worth of weekly metricbeat indices were mapped incorrectly (missing the template), which breaks the Infra UI. The solution is pretty simple: reindex and apply an alias for today.

However, I cannot reindex with live indexing in play, thus my options are:

Neither option really gives me a complete solution, but if I could atomically rename the index while applying an alias to its replacement index, then I could divert indexing to the aliased index and reindex into it at the same time with zero downtime, zero issues, and an immediately working UI with data being backfilled.

dakrone commented 5 years ago

We discussed this in the core/features meeting and decided that, while we would definitely like to do this, it's technically challenging. I spoke with Simon about it, and one of the prerequisites for being able to easily do this is to switch all the instances where we use a String to represent an index to the actual Index class.

For now I'm removing the discuss label and leaving this as high-hanging fruit.

jcttrll commented 5 years ago

This may be impractical, but one option I looked for in the index templates feature is forcing the index name at creation time, as well as applying an alias. The force-name feature doesn't exist, however.

As a concrete example, imagine Logstash (or anything else) writes to indexes like app-2019-08-20. It takes advantage of automatic index creation. However, what I really want is for Logstash to write to the alias app-2019-08-20, which corresponds to some index (some other name I couldn't care less about--could be randomly generated, for all I care). This enables me to atomically flip that alias to another index, then reindex the real index into the second index. But Logstash--like other things taking advantage of automatic index creation--doesn't have the ability to create index A and then write to index (alias) B forever afterward. It just starts writing to A and always writes to A.

This isn't extremely well thought through, but my idea is something like this in the index template:

{
    "order": 10,
    "index_patterns": [
      "app-*"
    ],
    "rename": "{index}-something",
...
}

Given the index name app-2019-08-20, this would create an index named app-2019-08-20-something with an alias of app-2019-08-20.
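
To make the intended behavior concrete, here is a small sketch of how such a "rename" pattern might be resolved. This is not an existing Elasticsearch feature; the function name `resolve_rename` and the `{index}` placeholder handling are assumptions based only on the template fragment above.

```python
def resolve_rename(requested_name, rename_pattern):
    """Hypothetical resolution of a template "rename" setting.

    Returns (concrete_index, alias): the index that would actually be
    created, and the alias that keeps the originally requested name.
    """
    concrete = rename_pattern.replace("{index}", requested_name)
    if concrete == requested_name:
        # An index and an alias cannot share the same name.
        raise ValueError("rename pattern must change the index name")
    return concrete, requested_name


index_name, alias = resolve_rename("app-2019-08-20", "{index}-something")
print(index_name)  # app-2019-08-20-something
print(alias)       # app-2019-08-20
```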

cjcenizal commented 2 years ago

CC @elastic/kibana-stack-management

We have a problem with Upgrade Assistant (https://github.com/elastic/kibana/issues/120137) that a "rename in place" API would solve.

The problem

In 7.16 and 7.17, Upgrade Assistant reindexes indices from 6.x to make them compatible with 8.x prior to the upgrade to 8.x. It does this by reindexing source to reindexed-source, creating a source->reindexed-source alias, and then deleting the original source index. This work is opaque to the user, and they're never notified about the resulting renamed index or alias.
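
Roughly, that sequence could be expressed as the two requests below. This is a sketch, not the actual Upgrade Assistant code: the helper name and the `reindexed-` prefix parameter are illustrative, and combining the alias creation with `remove_index` in one `_aliases` call is an assumption about how the swap is done.

```python
def reindex_and_alias_steps(source, prefix="reindexed-"):
    """Return (method, path, body) tuples for the reindex-then-alias flow."""
    dest = prefix + source
    return [
        # 1. Copy the documents into the new index.
        ("POST", "/_reindex",
         {"source": {"index": source}, "dest": {"index": dest}}),
        # 2. Delete the original index and reuse its name as an alias.
        ("POST", "/_aliases", {"actions": [
            {"add": {"index": dest, "alias": source}},
            {"remove_index": {"index": source}},
        ]}),
    ]


steps = reindex_and_alias_steps("source")
for method, path, body in steps:
    print(method, path, body)
```

After the second request, `source` exists only as an alias for `reindexed-source`.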

This creates a subtle change in the perceived behavior of index APIs. A user who is used to working with one of these indices might attempt to delete it after it's been reindexed by Upgrade Assistant. They'll get back a 404 because this index no longer exists.

"This is confusing!" thinks the user. 🤔 "I can still search the index, why can't I delete it?" We know search works via the alias, but the user doesn't. "What's going on here?" they wonder. The user will need to do some digging and might even be able to figure out what's happening, but they probably won't understand why they now have an alias pointing to some renamed version of their original index.

Proposed solution

The original issue (https://github.com/elastic/elasticsearch/issues/17426) gave this series of pseudo commands as the desired workflow:

_reindex a->b
_close b
DELETE a
_rename b->a
_open a

This is very close to what @sebelga and I would like to do. If Upgrade Assistant could simply delete the original index and then rename the reindexed index to the original index's name, then the alias could be removed and the perceived behavior of the index APIs would remain unchanged from the user's point of view.