Closed bhavyarm closed 2 years ago
Pinging @elastic/kibana-core (Team:Core)
- Go to upgrade assistant - reindex sample data
@bhavyarm is this step required to see the sample data on the 7.x
instance? What happens if going directly from step 1 to step 3?
@elastic/kibana-stack-management what exactly does this action do?
@pgayvallet Reindexing indices from 6.x will remove these deprecated settings from them:
Who owns the sample data? It seems like there's a bug in the logic because in @bhavyarm's screenshot, the "Add data" buttons are active:
If the sample data was installed in 6.8, then these buttons should be replaced with "Remove data" buttons. I think an engineer from the team that owns the sample data needs to take a look at this logic and identify the root cause of this unexpected behavior.
@cjcenizal we are allowed to install sample data in every space. Please note this is a new space I created in 7.16.
@pgayvallet if a user upgrades to 7.16 from 6.8.x, creates a new space, and installs sample data without touching Upgrade Assistant, everything works fine. But if the user goes to Upgrade Assistant, reindexes the sample data, and then creates a new space and tries to install sample data, things break.
These log lines are the problem.
log [12:28:23.154] [warning][home][plugins][sampleData] Unable to create sample data index "kibana_sample_data_ecommerce", error: invalid_index_name_exception: [invalid_index_name_exception] Reason: Invalid index name [kibana_sample_data_ecommerce], already exists as alias
Oh I missed the part about creating a new space! Thanks for pointing that out.
if a user upgrades to 7.16 from 6.8.x, creates a new space, and installs sample data without touching Upgrade Assistant, everything works fine. But if the user goes to Upgrade Assistant, reindexes the sample data, and then creates a new space and tries to install sample data, things break.
@bhavyarm In the scenario where the user did not use UA to reindex in 7.16, they still installed the sample data in the default space in 6.8, right? Just want to be sure, because given @cjcenizal's answer, if the UA reindex action just performs an update of the index's settings, the behavior should be the same with or without a reindex from UA.
Given the error "Reason: Invalid index name [kibana_sample_data_ecommerce], already exists as alias", it makes me wonder if we weren't using an alias for these sample data in 6.x. @cjcenizal, can you please just (double) confirm that the UA's reindex action does not do fancy things on moving/creating aliases?
Also may be related to https://github.com/elastic/kibana/issues/116677
@pgayvallet @cjcenizal The reindexing action creates an alias to point from the old index to the new index. Here is an example of the process for an index test_data:
1. Set test_data to read only
2. Create a new index reindexed-v6.8-test_data
3. Reindex from test_data to reindexed-v6.8-test_data
4. Create an alias test_data that points to reindexed-v6.8-test_data
5. Delete the old index test_data
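The steps above can be sketched against a small in-memory model. This is only an illustration of the end state, not the Upgrade Assistant code: `ClusterModel`, `IndexModel`, and `reindexWithAlias` are hypothetical names, and nothing here calls Elasticsearch.

```typescript
// Hypothetical in-memory model of a cluster. This is NOT the Elasticsearch
// client API; it only tracks names so the end state is easy to inspect.
type Doc = Record<string, unknown>;

interface IndexModel {
  readOnly: boolean;
  docs: Doc[];
}

class ClusterModel {
  indices = new Map<string, IndexModel>();
  aliases = new Map<string, string>(); // alias name -> target index name

  createIndex(name: string): IndexModel {
    if (this.indices.has(name) || this.aliases.has(name)) {
      throw new Error(`name already in use: ${name}`);
    }
    const created: IndexModel = { readOnly: false, docs: [] };
    this.indices.set(name, created);
    return created;
  }
}

function reindexWithAlias(cluster: ClusterModel, source: string, prefix: string): string {
  const original = cluster.indices.get(source);
  if (original === undefined) {
    throw new Error(`no such index: ${source}`);
  }
  const target = `${prefix}${source}`;

  // Set the old index to read only.
  original.readOnly = true;
  // Create the new index.
  const created = cluster.createIndex(target);
  // Reindex: copy every document from the old index to the new one.
  for (const doc of original.docs) {
    created.docs.push(doc);
  }
  // Final two steps, done together in this model because a name cannot be
  // both an index and an alias: drop the old index, add the alias.
  cluster.indices.delete(source);
  cluster.aliases.set(source, target);
  return target;
}
```

After `reindexWithAlias(cluster, "test_data", "reindexed-v6.8-")`, the model has no index named test_data any more; the name only exists as an alias for reindexed-v6.8-test_data.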
@bhavyarm In the scenario where the user did not use UA to reindex in 7.16, they still installed the sample data in the default space in 6.8, right?
@pgayvallet yes.
The reindexing action creates an alias to point from the old index to the new index. Here is an example of the process for an index test_data
So at the end of the reindex operation, the test_data index is deleted, and there is a new test_data alias pointing to reindexed-v6.8-test_data. This is a disruptive action that any consumer of the index has to be aware of and take into account when performing operations.
This causes the uninstall/reinstall logic for the sample data sets to fail: the indices.delete operation fails (silently, sigh..) because the target is an alias, and then the indices.create operation fails because an alias already exists with the same name.
We could eventually add additional logic to check if the target index is an alias first, and then delete both the alias and its target, but to be honest, that doesn't feel right. I don't think any upgrade-related operation should alter the state of indices in such a significant way. After a reindex, I would expect to have an index with the same name, not an alias pointing to another index. This is dangerous for any index-based operations (such as deletes).
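To make the failure and the proposed extra check concrete, here is a minimal sketch under stated assumptions. `NameRegistry`, `naiveReinstall`, and `aliasAwareReinstall` are hypothetical names for an in-memory model; the real calls would be indices.delete and indices.create, and the model only mirrors the behaviour described in this thread (a delete that does nothing for an alias, and a create that is rejected when the name exists as an alias).

```typescript
// Hypothetical in-memory model, not the home plugin's actual code.
class NameRegistry {
  indices = new Set<string>();
  aliases = new Map<string, string>(); // alias name -> target index name

  // Returns false when no index has this name, for example because the
  // name is an alias. The caller is free to ignore the result.
  deleteIndex(name: string): boolean {
    return this.indices.delete(name);
  }

  createIndex(name: string): void {
    if (this.aliases.has(name)) {
      throw new Error(`Invalid index name [${name}], already exists as alias`);
    }
    if (this.indices.has(name)) {
      throw new Error(`index [${name}] already exists`);
    }
    this.indices.add(name);
  }
}

// Current behaviour as described above: delete, then create.
function naiveReinstall(registry: NameRegistry, name: string): void {
  registry.deleteIndex(name); // silently does nothing for an alias
  registry.createIndex(name); // throws: the name is taken by the alias
}

// Possible workaround: check for an alias first, then remove both the
// alias and the index it points to before creating the index.
function aliasAwareReinstall(registry: NameRegistry, name: string): void {
  const target = registry.aliases.get(name);
  if (target !== undefined) {
    registry.aliases.delete(name);
    registry.indices.delete(target);
  } else {
    registry.deleteIndex(name);
  }
  registry.createIndex(name);
}
```

With an alias kibana_sample_data_ecommerce pointing at a reindexed index, `naiveReinstall` throws the same "already exists as alias" message seen in the logs, while `aliasAwareReinstall` ends with a plain index under the original name.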
After a reindex, I would expect to have an index with the same name, not an alias pointing to another index
I tend to agree with you, though this runs up against the current Elasticsearch model. Renaming an index in place isn't possible today, AFAIK. From what I can tell the prescribed method of renaming is to use an alias, or to do a full reindex.
We should verify that a second full reindex wouldn't be feasible for UA, but assuming that it isn't, perhaps the optimal path is to add that additional logic to Sample Data. This is the first time I believe we've encountered this problem, which I do find a bit surprising but there it is.
cc @LeeDr
We should verify that a second full reindex wouldn't be feasible for UA, but assuming that it isn't, perhaps the optimal path is to add that additional logic to Sample Data
If this is too much work, or just isn't possible to do from UA, I agree that we'll 'just' have to do it from the home plugin. I just hate the idea of adding additional bullet-proofing logic to protect against changes in the indices/aliases that aren't performed from the owning plugin. I mean, what proof do we have that these sample data indices are the only ones impacted by this problem? I doubt these are the only data indices that can be created/deleted 'internally' by Kibana.
This is the first time I believe we've encountered this problem, which I do find a bit surprising but there it is.
I agree, this is surprising 😅
@sebelga Could you please investigate this? I see two main questions we want to answer.
To conclude Upgrade Assistant's reindex process with an updated index that has the same name as the original, we'd need to reindex the updated index into a new index. We wouldn't need to make any changes to the destination index other than giving it a new name. What's the performance impact, and how does this performance change as the size of the index grows? If the performance impact is O(1) or O(log N) then a second full reindex is feasible. Otherwise, I'd say it isn't.
If Elastic has more logic that's similar to the Sample Data, in which datasets are installed and later detected through the presence of indices, then we'll probably trip over this again. For example, do any Integrations install datasets? If Upgrade Assistant reindexes those datasets, will that break their logic for detecting installed Integrations, uninstalling Integrations, or migrating them during version upgrades?
Is a second full reindex feasible?
Talking with @martijnvg, it does seem that reindex time does not increase linearly with index size, and that it is not an O(1) operation. As per his comment:
"in the hot threads a significant time is spent on looking up the id/version during reindexing. This is something that reindex does to ensure that no duplicates are indexed into the target index. But as the destination index grows so does the cost/time it takes to lookup this _id/version".
Where else might this problem manifest?
It seems that the issue we are discovering could occur on any dataset, not only the ones from our products/integrations, but also the ones from our users.
Seb and I chatted and we agree that the ideal solution would be to rename the index in place (https://github.com/elastic/elasticsearch/issues/37880). Unfortunately, the ES Data Management team tells me it's not possible to implement this for 7.17.
It's also not feasible to add a second reindexing phase as I originally proposed, because we're still stuck with the problem of having to create an index with a name that collides with an index alias. The ES Data Management team also says it's not feasible to provide an API option to force this.
As an outcome, we're left with three steps:
It's also not feasible to add a second reindexing phase as I originally proposed, because we're still stuck with the problem of having to create an index with a name that collides with an index alias.
If we add a second reindexing phase we don't need to create an alias, right?
Adding a second reindexing phase and keeping an index (instead of using an alias) seems the least intrusive change for our users. It would initially be less performant than a "rename in place API" but we'll update and use that API once it is ready. Thoughts @cjcenizal ?
We discussed this sync, so I'll record what we discussed here for visibility.
If we add a second reindexing phase we don't need to create an alias, right?
Right.
Adding a second reindexing phase and keeping an index (instead of using an alias) seems the least intrusive change for our users. It would initially be less performant than a "rename in place API" but we'll update and use that API once it is ready. Thoughts @cjcenizal ?
Agreed. The main concern is cost, since reindexing incurs data transfer costs.
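A rough sketch of the two-phase idea, assuming the second phase simply copies the reindexed data back under the original name. `ClusterState` and `reindexKeepingName` are hypothetical, in-memory stand-ins, not Upgrade Assistant code. The returned count is there to make the cost concern visible: every document is written twice.

```typescript
// Hypothetical in-memory model; index contents are plain arrays.
type Row = Record<string, unknown>;

class ClusterState {
  indices = new Map<string, Row[]>();
  aliases = new Map<string, string>(); // stays empty in this approach
}

// Returns the total number of document writes across both phases.
function reindexKeepingName(state: ClusterState, source: string, prefix: string): number {
  const original = state.indices.get(source);
  if (original === undefined) {
    throw new Error(`no such index: ${source}`);
  }
  const temp = `${prefix}${source}`;
  let written = 0;

  // Phase 1: reindex into a temporary index.
  const upgraded = original.slice();
  state.indices.set(temp, upgraded);
  written += upgraded.length;

  // Free the original name, then phase 2: reindex back under it.
  state.indices.delete(source);
  const restored = upgraded.slice();
  state.indices.set(source, restored);
  written += restored.length;

  // Clean up the temporary index. No alias is ever created.
  state.indices.delete(temp);
  return written;
}
```

For an index holding N documents, this model performs 2N writes, which is the extra cost compared with the alias approach or a future rename-in-place API.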
Kibana version: 7.16.0 snapshot Dec 1st
Elasticsearch version: 7.16.0 snapshot Dec 1st
Server OS version: darwin_x86_64
Browser version: chrome latest
Browser OS version: OS X
Original install method (e.g. download page, yum, from source, etc.): from snapshots
Describe the bug: If a user has sample data installed in 6.8.x, upgrades to 7.16, reindexes the 6.8 indices, and then creates a new space and tries to install sample data, Kibana displays an internal server error.
Steps to reproduce:
Screenshots (if relevant):
Errors in browser console (if relevant):
Provide logs and/or server output (if relevant):
Kibana logs: