opensearch-project / OpenSearch-Dashboards

📊 Open source visualization dashboards for OpenSearch.
https://opensearch.org/docs/latest/dashboards/index/
Apache License 2.0

Support rolling upgrades #365

Open dblock opened 3 years ago

dblock commented 3 years ago

Is your feature request related to a problem? Please describe.

Kibana does not support rolling upgrades per https://www.elastic.co/guide/en/kibana/current/upgrade-standard.html. Support rolling upgrades without downtime.

Describe the solution you'd like

A way to perform a rolling upgrade between versions of OpenSearch Dashboards without downtime with multiple nodes of OpenSearch Dashboards.

tmarkley commented 2 years ago

I think it's time to revive this conversation. @dblock are there any key insights from the work done to support rolling upgrades for OpenSearch that we could apply to Dashboards?

dblock commented 2 years ago

I think it's time to revive this conversation. @dblock are there any key insights from the work done to support rolling upgrades for OpenSearch that we could apply to Dashboards?

Big question! I think @saratvemulapalli will have a lot to say about this.

kavilla commented 2 years ago

Something to consider: one of the main reasons the legacy application did not support rolling upgrades was potential data loss due to mapping changes in the system index that we use for OpenSearch Dashboards. If a mapping was changed, an older version of OpenSearch Dashboards might still try to write documents in an old mapping when the system index has already been migrated to a new version.
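
To make the hazard concrete, here is a minimal sketch of a guard that would refuse writes from an instance older than the version the system index was last migrated to. Everything in it is an assumption for illustration: `isStaleWriter`, the idea of recording a migration version alongside the index, and the plain `major.minor.patch` version format are hypothetical and not taken from the OpenSearch Dashboards code base.

```typescript
// Hypothetical sketch: these names are illustrative, not OpenSearch Dashboards APIs.
type Version = [number, number, number];

// Parse "major.minor.patch" into numbers; a suffix such as "-SNAPSHOT" is ignored by parseInt.
function parseVersion(v: string): Version {
  const parts = v.split(".").map((p) => Number.parseInt(p, 10));
  if (parts.length !== 3 || parts.some((n) => Number.isNaN(n))) {
    throw new Error(`Invalid version: ${v}`);
  }
  return [parts[0], parts[1], parts[2]];
}

// Returns -1, 0, or 1 when a is lower than, equal to, or higher than b.
function compareVersions(a: string, b: string): number {
  const va = parseVersion(a);
  const vb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (va[i] !== vb[i]) return va[i] < vb[i] ? -1 : 1;
  }
  return 0;
}

// An instance is a "stale writer" when the system index has already been
// migrated by a newer version than the instance itself (assumed to be tracked somewhere).
function isStaleWriter(instanceVersion: string, indexMigrationVersion: string): boolean {
  return compareVersions(instanceVersion, indexMigrationVersion) < 0;
}
```

With a check like this, an old instance could reject or defer a write instead of storing a document in an outdated shape.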

shdubsh commented 2 years ago

If a mapping was changed, an older version of OpenSearch Dashboards might still try to write documents in an old mapping when the system index has already been migrated to a new version.

Is the "system index" the .opensearch_dashboards index (formerly .kibana index)? The following questions assume yes. Sorry in advance if this assumption is incorrect.

During a rolling upgrade, Dashboards instances that haven't been restarted still continue to work against a mixed-version cluster. Is this due to Dashboards waiting to run the mapping migration until the OpenSearch instance versions all match?

What if Dashboards entered a read-only mode until all OpenSearch cluster node versions match? Read-only would be preferable to a complete Dashboards outage for the duration of the upgrade. Read-only might also better protect against data loss since it could be reactive rather than a state entered into on startup and never checked again.
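
A minimal sketch of that read-only idea, assuming Dashboards can periodically obtain the version of every OpenSearch node in the cluster. The names `resolveMode` and `DashboardsMode` are hypothetical, and how the version list is fetched is left out on purpose.

```typescript
// Hypothetical sketch: not an existing OpenSearch Dashboards API.
type DashboardsMode = "read-write" | "read-only";

// Decide the mode from the versions reported by the cluster nodes.
// Writes are allowed only when every node reports the same version.
function resolveMode(nodeVersions: string[]): DashboardsMode {
  // With no version information, stay on the safe side.
  if (nodeVersions.length === 0) return "read-only";
  const distinct = new Set(nodeVersions);
  return distinct.size === 1 ? "read-write" : "read-only";
}

// Reactive use: call this on a timer or before each write,
// rather than deciding once at startup and never checking again.
function canWrite(nodeVersions: string[]): boolean {
  return resolveMode(nodeVersions) === "read-write";
}
```

Calling `canWrite` again on every check is what makes the mode reactive: Dashboards would drop to read-only when the first upgraded node appears and return to read-write once the last node has been upgraded.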

kavilla commented 2 years ago

Is the "system index" the .opensearch_dashboards index (formerly .kibana index)? The following questions assume yes. Sorry in advance if this assumption is incorrect.

We had a change for .opensearch_dashboards, but then we restored .kibana since we got the legal go-ahead to do that. So right now, in OSD 1.x, the system index is .kibana.

During a rolling upgrade, Dashboards instances that haven't been restarted still continue to work against a mixed-version cluster. Is this due to Dashboards waiting to run the mapping migration until the OpenSearch instance versions all match?

Do you mean during an OpenSearch rolling upgrade? For that, the system index for OpenSearch Dashboards remains unchanged. So when the engines do a rolling upgrade, OpenSearch Dashboards is just doing a request on .kibana. Any mapping migration in a release environment generally happens on startup of OpenSearch Dashboards (I do believe you can configure a lazy migration, but I'm not sure if that is possible in a release environment).

What if Dashboards entered a read-only mode until all OpenSearch cluster node versions match? Read-only would be preferable to a complete Dashboards outage for the duration of the upgrade. Read-only might also better protect against data loss since it could be reactive rather than a state entered into on startup and never checked again.

This is a good idea! I was going to mention this, but I guess in terms of a rolling upgrade, it doesn't really break the tenet of "without disruption of service". That could be a good middle ground and way better than a complete outage.

saratvemulapalli commented 2 years ago

I think it's time to revive this conversation. @dblock are there any key insights from the work done to support rolling upgrades for OpenSearch that we could apply to Dashboards?

That's a good question. I'll try to answer in short but would be more than happy to dive deeper.

When we did the fork for OpenSearch, it already supported rolling upgrades. Most of the work we did was to help communication between Elasticsearch 7.10.2 <-> OpenSearch 1.0 in transport. Since both of them are essentially the same code base, taking care of transport solved the problem. For OpenSearch plugins it was mostly updating various artifacts: https://github.com/opensearch-project/opensearch-plugins/issues/12

That said, I am curious: what are the challenges that prevent support for rolling upgrades in OpenSearch Dashboards?