Closed stratoula closed 5 months ago
Pinging @elastic/kibana-visualizations @elastic/kibana-visualizations-external (Team:Visualizations)
I ran the migration for a dashboard that was reported as broken and it ran successfully, so I don't think there is a problem with the migration itself. Also, this problem doesn't always happen. Furthermore, I am not sure that it is this specific migration that doesn't run. Possibly the migration fails somewhere else but this is the most disrupting change so this is the only problem that is visible. It is very possible that other migrations don't run either.
@elastic/kibana-core do you have any idea why the migrations might not run in some cases?
commonMigrateIndexPatternDatasource is an embeddable migration, right, given that I couldn't find it directly in the dashboard migrations?
Which version is it registered for?
Looking at the associated support issue, and the faulty dashboard SO:
{
// ....
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "8.7.0",
}
Possibly the migration fails somewhere else but this is the most disrupting change so this is the only problem that is visible
Any error during document migration would have caused the migration to fail. But also, it wouldn't have updated the typeMigrationVersion or coreMigrationVersion of the document.
Except, of course, if the registered type migration fails silently, e.g. with a try / catch / trap pattern.
do you have any idea why the migrations might not run in some cases?
As said, the first option coming to mind would be the registered migration failing silently.
~Given it's an embeddable migration, deferred migrations implemented in #153117 also comes to mind. But AFAIK this is not yet used in 8.8 so I doubt it could be it.~
EDIT: never mind, https://github.com/elastic/kibana/pull/153117 was merged in 8.9.0, so it can't be the culprit.
@rudolf any other idea?
It was registered in 8.6.0, here it is: https://github.com/elastic/kibana/blob/main/x-pack/plugins/lens/server/migrations/saved_object_migrations.ts#L592 (the migrateIndexPatternDatasource), but as I said it is the most disruptive one (if it doesn't run, the Lens SOs are broken).
It runs both for embedded visualizations but also for Lens SOs. (as all our migrations)
This specific one doesn't have a try/catch and is a very simple one. No, we are still using the server migrations that run on Kibana upgrade.
It runs both for embedded visualizations but also for Lens SOs. (as all our migrations)
But the problem we observed is only about embedded vis, right? Or did they also report the problem on "root" visualizations?
@pgayvallet correct, so far three customers have reported to us the same failure and all have reported dashboards with by value visualizations (no reference to Lens SO, everything is saved on the dashboard SO)
so far three customers have reported to us the same failure
I was only aware of one of the support issues. Would you mind backlinking from the SDHs to this issue so we can track them?
The first thing I would check is: which versions did those customers upgrade from and to? If all of those were reported during upgrades to 8.8.0, we might indeed have an issue somewhere, either in the persistable state migration system or in the SO migration itself.
@pgayvallet I pinged all the others, 4 in total. The latest is an upgrade to 8.8, but the other 3 are related to upgrades to 8.6.x. They upgraded from different 8.x versions.
Great, thanks for the backlinks.
So the problem seems to occur for upgrades to any version including this commonMigrateIndexPatternDatasource migration (8.6.0+).
I took a brief look just to confirm, but we didn't perform any structural (or just any fwiw) changes on the document migrator during this, or prior minor, versions.
This, in addition to the fact that faulty documents have the correct migrationVersion / typeMigrationVersion property set, and that no similar cases were reported for "real SO" (non-embedded) visualizations, makes me suspect it might be a problem in the way the persistable state migration works?
Could it be that errors in chained migrations are swallowed in any way? Could a failure in one of the "composite" migrations silently trap the error, while not running the following migrations of the composite? It could lead to a state where the document's migrationVersion is set, but the embeddable state is not fully migrated.
Who would be the best person(s) to answer this?
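To make the hypothesis above concrete, here is a minimal sketch of how a composite migration that traps errors could leave a panel with a bumped version but unmigrated state. Every name in it (PanelStateSketch, composeSwallowing, stampVersion, the two steps) is hypothetical and written only for illustration; it is not Kibana's persistable state code, and nothing in this thread confirms that the real code behaves this way.

```typescript
// Hypothetical shapes for illustration only, not Kibana's real types.
type PanelStateSketch = Record<string, unknown>;
type MigrationStep = (state: PanelStateSketch) => PanelStateSketch;

// A composite that traps errors: when a step throws, the remaining steps are
// skipped and the state produced before the failing step is returned.
function composeSwallowing(steps: MigrationStep[]): MigrationStep {
  return (state) => {
    let current = state;
    try {
      for (const step of steps) {
        current = step(current);
      }
    } catch {
      // The error is dropped here, so the caller cannot tell anything failed.
    }
    return current;
  };
}

// The caller stamps the target version no matter what the composite did.
function stampVersion(
  panel: PanelStateSketch,
  composite: MigrationStep,
  targetVersion: string
) {
  return { ...composite(panel), version: targetVersion };
}

const failingStep: MigrationStep = () => {
  throw new Error('unexpected panel shape');
};
const laterStep: MigrationStep = (state) => ({ ...state, laterStepApplied: true });

const migratedPanel = stampVersion(
  { version: '8.2.0' },
  composeSwallowing([failingStep, laterStep]),
  '8.8.0'
);
console.log(migratedPanel);
```

In this sketch the resulting panel carries the new version while the later step never ran, which is the same shape of symptom described for the impacted dashboards.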
I see your points Pierre. I am pinging here @ThomThomson as they own the by value migrations on the dashboard and he might have some insights on how these migrations work.
Some observations we made with @stratoula this morning:
On impacted dashboards, none of the lens embeddables were migrated. It's not like some were and some others were not.
We tried to check if other types of embeddables on impacted dashboards were also non-migrated, but there are very few migrations registered for legacy visualizations, and our data sample didn't allow us to get any concrete result, so we don't know if it's lens specific or not.
The dashboards have an up-to-date migrationVersion (or typeMigrationVersion), so they went through the SO migration (document migrator) process.
The non-migrated embeddables have an up-to-date version too (attributes.panelsJSON[X].version), so the embeddable migration at least did something / applied the version to these embeddables.
We couldn't figure out if all dashboards are impacted during an upgrade, or only a few. The problem is only visible for dashboards with lens visualizations, and no SDH explicitly stated that some of their other dashboards with lens vis were okay, so we couldn't say for sure.
It only occurs for embedded lens. Impacted deployments reported being able to access their by-reference lens, which were migrated correctly.
We couldn't (of course...) reproduce this, either locally or on Cloud, either by manually recreating a dashboard in a prior version and upgrading, or even by retro-importing a faulty dashboard (with versions updated) into a prior version and upgrading.
We can't say if it's specifically related to this 8.6.0 migration, but our gut feeling is that it's not. It's just the only disruptive migration that surfaces errors to the end user (therefore the SDHs). We may have other "silent" occurrences of the problem with other versions. E.g. commonMigratePartitionMetrics was also not applied on those lens embeddables, but this migration doesn't directly cause errors.
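The observations above suggest a simple check for impacted dashboards: a by-value lens panel whose datasourceStates still has the indexpattern key and no formBased key. The sketch below is one way to express that check. The nesting embeddableConfig.attributes.state.datasourceStates, the `type: 'lens'` discriminator, and the names DashboardPanelSketch and findUnmigratedLensPanels are assumptions made for illustration, not something stated in this thread.

```typescript
// Assumed shape of one entry in a dashboard's panelsJSON array (illustrative).
interface DashboardPanelSketch {
  type?: string;
  version?: string;
  embeddableConfig?: {
    attributes?: {
      state?: { datasourceStates?: Record<string, unknown> };
    };
  };
}

// Returns the indexes of by-value lens panels that still carry the old
// `indexpattern` datasource key and have no `formBased` key.
function findUnmigratedLensPanels(panelsJSON: string): number[] {
  const panels = JSON.parse(panelsJSON) as DashboardPanelSketch[];
  const unmigrated: number[] = [];
  panels.forEach((panel, index) => {
    const states = panel.embeddableConfig?.attributes?.state?.datasourceStates;
    if (
      panel.type === 'lens' &&
      states !== undefined &&
      'indexpattern' in states &&
      !('formBased' in states)
    ) {
      unmigrated.push(index);
    }
  });
  return unmigrated;
}

// Made-up sample: one unmigrated lens panel, one migrated, one non-lens panel.
const samplePanelsJSON = JSON.stringify([
  {
    type: 'lens',
    version: '8.8.0',
    embeddableConfig: {
      attributes: { state: { datasourceStates: { indexpattern: { layers: {} } } } },
    },
  },
  {
    type: 'lens',
    version: '8.8.0',
    embeddableConfig: {
      attributes: { state: { datasourceStates: { formBased: { layers: {} } } } },
    },
  },
  { type: 'visualization', version: '8.8.0', embeddableConfig: {} },
]);

const unmigratedIndexes = findUnmigratedLensPanels(samplePanelsJSON);
console.log(unmigratedIndexes);
```

Note that the panel version is deliberately not part of the check, since the impacted panels reportedly have an up-to-date version even though their state was not migrated.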
Hello,
We are one of Elastic's customers who ran into the issue and raised it with support. I can confirm that we had 2 dashboards using only lens visualisations (no reference to Lens SO, everything is saved on the dashboard SO). One of the dashboards migrated fine when we upgraded from v7.17.10 to v8.8.0 but the other ran into problems.
Both the dashboards use similar lens visualisations (similar aggregations) but on different data sources. Let me know if you need any further details.
@xPB12 thanks for the information, it would confirm that the migration outcome can be different per document on a given cluster.
Let me know if you need any further details.
it would be very useful to us if you could provide:
the two dashboards as stored in the 8.8.0 index (by just exporting them using the management/export feature)
the same dashboards as stored in the 7.17.10 index.
It would help us try to find differences that could explain the different outcome during the migration.
You can use the following es query for that:
GET /.kibana_7.17.10_001/_doc/dashboard:{ID-OF-THE-DASHBOARD}
Do the dashboard definitions contain anything sensitive? Are you fine sharing them on this issue or should we go through support or another mechanism to transfer them?
I checked the migrations on the dashboard ndjson provided by the customer and I can say with certainty that the 8.1.0 and 8.3.0 migrations haven't run either. For the rest I can't confirm because the dashboard doesn't have embeddables that satisfy their criteria. So we can say that the problem is not the 8.6 migrations but that something else is going on. My assumption is that none of the migrations ran.
The issue was resolved for the non-working dashboard by changing a key in the saved object from indexpattern to formBased. Both the working and the culprit saved object JSON are available in the support case #01379924.
I've also attached to the case both dashboard versions returned by the GET /.kibana_7.17.10_001/_doc/dashboard:{ID-OF-THE-DASHBOARD} query.
Thanx @xPB12, I don't see a significant difference between these 2 dashboards cc @pgayvallet
We can create a fallback for this migration (so if it is skipped at least the visualizations will work) but this is not the solution for this problem.
The problem is that the migrations are skipped on a dashboard for by-value panels for some reason that we can't identify atm. The problem is not this migration in particular but something in the migration system.
Pinging @elastic/kibana-presentation (Team:Presentation)
@xPB12 we created a fallback for the missing datasource issue that is going to be released in 8.9. So these panels are not going to break even if the migration does not run.
Can you do a test for us? If you import the 7.17 dashboard in 8.7 from the Saved objects management page, are the panels being depicted correctly?
Today @delvedor showed me another possible instance of this bug, with a missing new metric migration within a dashboard.
I'm running into this problem trying to upgrade from a fork of 8.2.2 to a fork of 8.8.2. I applied the same fixes in https://github.com/elastic/kibana/commit/a67d83821f5a81448246ba23f96a524509d4e932 to my fork of 8.8.2 and can confirm the visualizations now render properly.
However, attempting to edit one of the Lens visualizations results in a broken visualization form/control that displays the following stack trace:
TypeError: Cannot read properties of undefined (reading 'layers')
at Object.uniqueLabels (https://localhost:5601/9007199254740991/bundles/plugin/lens/1.0.0/lens.chunk.4.js:19215:28)
at LayerPanel (https://localhost:5601/9007199254740991/bundles/plugin/lens/1.0.0/lens.chunk.2.js:19215:347)
at Gh (https://localhost:5601/9007199254740991/bundles/kbn-ui-shared-deps-npm/kbn-ui-shared-deps-npm.dll.js:225369:137)
at rk (https://localhost:5601/9007199254740991/bundles/kbn-ui-shared-deps-npm/kbn-ui-shared-deps-npm.dll.js:225487:63)
at qk (https://localhost:5601/9007199254740991/bundles/kbn-ui-shared-deps-npm/kbn-ui-shared-deps-npm.dll.js:225467:129)
at pk (https://localhost:5601/9007199254740991/bundles/kbn-ui-shared-deps-npm/kbn-ui-shared-deps-npm.dll.js:225466:435)
at ik (https://localhost:5601/9007199254740991/bundles/kbn-ui-shared-deps-npm/kbn-ui-shared-deps-npm.dll.js:225466:265)
at Yj (https://localhost:5601/9007199254740991/bundles/kbn-ui-shared-deps-npm/kbn-ui-shared-deps-npm.dll.js:225459:163)
at https://localhost:5601/9007199254740991/bundles/kbn-ui-shared-deps-npm/kbn-ui-shared-deps-npm.dll.js:225334:255
at exports.unstable_runWithPriority (https://localhost:5601/9007199254740991/bundles/kbn-ui-shared-deps-npm/kbn-ui-shared-deps-npm.dll.js:225557:343)
Update:
I've confirmed that replacing occurrences of \"datasourceStates\":{\"indexpattern\" with \"datasourceStates\":{\"formBased\" corrects the dashboard with embedded Lens visualizations when both viewing and attempting to edit an embedded Lens SO. However, it seems the PR to mitigate the issue only accounts for rendering of those visualizations. Attempting to edit them results in the error I've shown.
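The manual replacement described above can be scripted against the raw text of an exported dashboard. The sketch below works on the escaped form, because panelsJSON is stored as a JSON string inside the saved object. The function name patchExportedNdjson and the sample document are hypothetical, and the replacement only matches when indexpattern is the first key of datasourceStates, exactly as in the pattern quoted above.

```typescript
// Escaped forms as they appear in the raw export text (backslash + quote).
const OLD_DATASOURCE_KEY = '\\"datasourceStates\\":{\\"indexpattern\\"';
const NEW_DATASOURCE_KEY = '\\"datasourceStates\\":{\\"formBased\\"';

// Replaces every occurrence of the old key with the new one in the raw text.
// split/join is used so that no regular-expression escaping is needed.
function patchExportedNdjson(ndjson: string): string {
  return ndjson.split(OLD_DATASOURCE_KEY).join(NEW_DATASOURCE_KEY);
}

// Made-up sample line: a dashboard with one by-value lens panel.
const examplePanels = JSON.stringify([
  {
    type: 'lens',
    embeddableConfig: {
      attributes: { state: { datasourceStates: { indexpattern: { layers: {} } } } },
    },
  },
]);
const exportedLine = JSON.stringify({
  type: 'dashboard',
  attributes: { panelsJSON: examplePanels },
});

const patchedLine = patchExportedNdjson(exportedLine);
const patchedPanels = JSON.parse(JSON.parse(patchedLine).attributes.panelsJSON);
console.log(Object.keys(patchedPanels[0].embeddableConfig.attributes.state.datasourceStates));
```

A plain text replacement keeps the rest of the export byte-for-byte identical, which makes the change easy to review in a diff before re-importing.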
To add to the above...
It seems that after applying the manual fix to the embedded Lens SOs, several of my Metric visualizations stopped working and instead showed up blank. If I edit them and change their type from Metric (lnsMetric) to Legacy Metric (lnsLegacyMetric) then they work.
It also seems like any visualization of type lnsPie isn't working:
Are these examples of other migrations that did not run?
@stratoula Not sure if this helps, but I found that the following seems to address all migration-related issues for me:
It seems like the migration process that happens automatically on Kibana startup does not work, but the migration process SOs go through via Import work fine.
Note that this workaround is only effective if you've managed to export the SOs before they've gone through any migration process. Trying to re-import something that Kibana already migrated does not work, as the version numbers on the SOs are bumped even though the migration changed nothing else.
@JAndritsch thanx a lot, yes we know that. Sometimes the internal Kibana migrations do not work for some dashboards, but if you import them from the SO management page the migrations will run. Thanx for looking into it
@stratoula Thanks for confirming.
I also found that I was able to take my badly-migrated dashboards (where only version numbers were updated) and run them through SO import after making the following changes:
coreMigrationVersion from 8.8.0 to 8.2.2
migrationVersion.dashboard: 8.2.0
typeMigrationVersion
Doing this effectively reverted the dashboards back to their pre-migrated state and allowed me to run them through a SO import to correctly migrate them forward to 8.8.2. This is a process that I was able to do even if I could not pre-export the unmigrated SOs before upgrade.
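The version rewind described above could be sketched as follows. The thread lists three fields (coreMigrationVersion, migrationVersion.dashboard, typeMigrationVersion) but does not say what was done with typeMigrationVersion, so this sketch assumes it is removed; treat that as a guess. The interface ExportedSavedObjectSketch, the function rewindDashboardVersions, and the sample id are hypothetical.

```typescript
// Minimal assumed shape of one exported saved object (illustrative only).
interface ExportedSavedObjectSketch {
  type: string;
  id: string;
  attributes: Record<string, unknown>;
  coreMigrationVersion?: string;
  typeMigrationVersion?: string;
  migrationVersion?: Record<string, string>;
}

// Returns a copy whose version fields point back at the pre-upgrade versions,
// so that a later saved object import treats the document as not yet migrated.
function rewindDashboardVersions(
  doc: ExportedSavedObjectSketch,
  coreVersion: string,
  typeVersion: string
): ExportedSavedObjectSketch {
  const rewound: ExportedSavedObjectSketch = {
    ...doc,
    coreMigrationVersion: coreVersion,
    migrationVersion: { ...doc.migrationVersion, [doc.type]: typeVersion },
  };
  // Assumption: typeMigrationVersion is dropped rather than rewritten.
  delete rewound.typeMigrationVersion;
  return rewound;
}

// Made-up document mirroring the version values mentioned in this thread.
const badlyMigrated: ExportedSavedObjectSketch = {
  type: 'dashboard',
  id: 'example-id',
  attributes: {},
  coreMigrationVersion: '8.8.0',
  typeMigrationVersion: '8.7.0',
};

const rewoundDoc = rewindDashboardVersions(badlyMigrated, '8.2.2', '8.2.0');
console.log(rewoundDoc);
```

The function returns a new object and leaves its input untouched, so the original export can be kept as a backup.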
@stratoula have we seen any other instances of this recently?
@ThomThomson we didn't have any updates on this but this doesn't mean anything. With this PR https://github.com/elastic/kibana/pull/160129 we make Lens visualizations work even if the disruptive migration fails, so users will never notice that it failed.
Unfortunately none of the other migrations cause failures in the charts, so I am not sure how easy it is for users to understand that something hasn't run and report it.
Adding the label blocked as we still haven't managed to reproduce it but the bug still persists. I am keeping it open for visibility and for gathering more cases.
The new embeddable system moves migrations from server to client. This has 2 benefits: 1) a failed migration will not block Kibana from upgrading, and 2) errors will no longer be silently swallowed. Instead, they will be displayed for the individual panel that failed to migrate.
Kibana version: 8.6+
Describe the bug: For some reason, in specific dashboards the commonMigrateIndexPatternDatasource migration, which migrates the indexpattern datasource to formBased, fails to run. As a result the Lens panels fail to load.
We haven't found why this is happening. The migration is very simple, it populates a new formBased field to the datasourceStates object. We haven't managed to replicate it but we have customers that have encountered that.
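As a rough illustration of the behaviour described above (the migration populates a formBased field in the datasourceStates object), here is a minimal sketch. It is not the code of commonMigrateIndexPatternDatasource: the type LensAttributesSketch and the function migrateIndexPatternDatasourceSketch are hypothetical, and dropping the old indexpattern key while leaving already-migrated documents untouched are assumptions of this sketch.

```typescript
// Assumed minimal shape of Lens attributes (illustrative only).
interface LensAttributesSketch {
  state: { datasourceStates: Record<string, unknown> };
}

// Moves the state stored under `indexpattern` to `formBased`.
// Documents without an `indexpattern` key are returned unchanged.
function migrateIndexPatternDatasourceSketch(
  attributes: LensAttributesSketch
): LensAttributesSketch {
  const { indexpattern, ...otherStates } = attributes.state.datasourceStates;
  if (indexpattern === undefined) {
    return attributes;
  }
  return {
    ...attributes,
    state: {
      ...attributes.state,
      datasourceStates: { ...otherStates, formBased: indexpattern },
    },
  };
}

const beforeMigration: LensAttributesSketch = {
  state: { datasourceStates: { indexpattern: { layers: {} } } },
};
const afterMigration = migrateIndexPatternDatasourceSketch(beforeMigration);
const afterSecondRun = migrateIndexPatternDatasourceSketch(afterMigration);
console.log(Object.keys(afterMigration.state.datasourceStates));
```

Running the sketch a second time changes nothing, which is the property that would make a client-side fallback safe to apply on every load.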
Errors in browser console (if relevant): The more usual error is
Any additional context: To fix it, search the dashboard saved object for
\"datasourceStates\":{\"indexpattern\"
and replace it with
\"datasourceStates\":{\"formBased\"