elastic / stack-docs

Elastic Stack Documentation

Document upgrade order for tribe and CCS #30

Open ppf2 opened 6 years ago

ppf2 commented 6 years ago

This is one area that is not covered in our docs today.

Even though tribe is deprecated, it is still in the product on 6.x. There will be questions around upgrading tribe node implementations for those who are not ready to switch to CCS yet. Afaik, they need to upgrade the tribe to 6.0 first before the downstream clusters because if the tribe remains on 5.x, it will not be able to join any downstream clusters that have indices created on 6.0. For example, can they upgrade tribe to 6.0 first and then do rolling restarts of downstream 5.6 clusters to 6.0? This is something we will have to sync up with dev on our recommendations. This will probably depend on the outcome of https://github.com/elastic/stack-docs/issues/17, but I do want to make sure that the results are documented :)

Note that some customers may resist switching to CCS right away because, eg.

So if we decide that we will not be testing/supporting tribe for rolling upgrades because the tribe node is deprecated, we will just have to document it to set the right expectations upfront.

Even for CCS, there will be questions on upgrade ordering (upgrade CCS first? Downstream clusters?).

zuketo commented 6 years ago

Hey Pius, just some quick notes on my own testing here for CCS:

  1. I first upgraded the CCS cluster to 5.6.0, to attempt a rolling upgrade of the CCS cluster to avoid downtime.

  2. Next I performed a rolling upgrade of the CCS cluster to 6.0:

Caused by: java.lang.IllegalStateException: Received message from unsupported version: [5.5.1] minimal compatible version is: [5.6.0]

Caused by: java.lang.IllegalStateException: Received message from unsupported version: [5.4.0] minimal compatible version is: [5.6.0]

  3. That didn't go well: a 6.0 CCS cluster needs the underlying clusters to be on at least 5.6.0.

  4. Upgraded CC1 and CC2 to 5.6.0.

  5. Upgraded CCS to 6.0.

  6. Successful upgrade.

I have some more research to do, e.g. can 5.6 CCS talk to 6.0 clusters? My guess is no, but I want to confirm first.
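As a rough illustration of the ordering that worked in this test, here is a minimal Python sketch of a pre-flight check that flags remote clusters still below 5.6.0 before the CCS cluster is moved to 6.0. The function names, the dictionary layout, and the mapping of versions to CC1 and CC2 are assumptions made for the example; only the 5.6.0 minimum comes from the error messages above. It is not an official tool, and it only handles plain `X.Y.Z` version strings.

```python
# Hypothetical pre-flight check; all names here are illustrative only.
# The minimum is taken from the "minimal compatible version" error above.
MIN_REMOTE_FOR_6_0_CCS = (5, 6, 0)


def parse_version(text):
    """Turn a plain version string such as '5.5.1' into a tuple (5, 5, 1)."""
    return tuple(int(part) for part in text.split("."))


def remotes_blocking_upgrade(remote_versions, minimum=MIN_REMOTE_FOR_6_0_CCS):
    """Return the sorted names of remote clusters older than `minimum`."""
    return sorted(
        name
        for name, version in remote_versions.items()
        if parse_version(version) < minimum
    )


# Assumption: which cluster ran which version is not stated in the notes;
# the two version numbers are the ones reported in the error messages.
remotes = {"CC1": "5.5.1", "CC2": "5.4.0"}

blocking = remotes_blocking_upgrade(remotes)
if blocking:
    print("Upgrade these clusters to at least 5.6.0 first:", ", ".join(blocking))
else:
    print("Remote clusters are new enough for a 6.0 CCS cluster.")
```

Comparing versions as integer tuples avoids the string-comparison trap where "5.10.0" would sort before "5.6.0".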

ppf2 commented 6 years ago

Thx for testing :)

Caused by: java.lang.IllegalStateException: Received message from unsupported version: [5.5.1] minimal compatible version is: [5.6.0]

Hmm, if that is intentional and not a bug, we will have to fix our doc here (https://www.elastic.co/guide/en/elasticsearch/reference/6.0/modules-cross-cluster-search.html): it says CCS can work with gateway-eligible nodes on 5.5 and above.

zuketo commented 6 years ago

@ppf2 good catch, I'll follow up on that.

More testing notes, the following configuration didn't have any issues:

Cluster 1 on 6.0.0 (C1)
Cluster 2 on 5.5.1 (C2)
CCS Cluster on 5.6.0 (CCS)

For the next steps, I can draft an outline for this, then follow up with the ES team for additional help.

debadair commented 6 years ago

I'm happy to update the docs once we know what the behavior/supported strategy is. This can be called out in the upgrade path builder, as well.

ppf2 commented 6 years ago

With 6.0 GA targeted for tomorrow, it is unlikely that we will have a recommendation and have this documented by then. Can we set a target to have this done before 6.0.1? :)

zuketo commented 6 years ago

Hey @ppf2, yea, this will be ready post-GA, but hopefully we can get it done shortly after (although I don't know which version yet).

I'm tracking my notes here: https://docs.google.com/document/d/1U33G4oqCp6gin0CJc3jlG10C-DH8Sfe7yXPpC1FZDpQ/edit

clintongormley commented 6 years ago

Even though tribe is deprecated, it is still in the product on 6.x. There will be questions around upgrading tribe node implementations for those who are not ready to switch to CCS yet. Afaik, they need to upgrade the tribe to 6.0 first before the downstream clusters because if the tribe remains on 5.x, it will not be able to join any downstream clusters that have indices created on 6.0. For example, can they upgrade tribe to 6.0 first and then do rolling restarts of downstream 5.6 clusters to 6.0? This is something we will have to sync up with dev on our recommendations. This will probably depend on the outcome of #17, but I do want to make sure that the results are documented :)

I would say that you have to do the following: (eg with clusters of 5.1 and 5.4)

For CCS, cross major version search should work with a CCS gateway of 5.6, so:

With the security changes in x-pack for CCS, I'm not sure about the compatibility with < 5.6. @tvernum ?

@s1monw could you confirm?

zuketo commented 6 years ago

Quick note, Five9 asked for a status update on this:

@s1monw can you confirm the ordering above (in @clintongormley 's comment)?

s1monw commented 6 years ago

What Clinton said sounds good to me!

On 7. Dec 2017, at 22:50, Jason Zucchetto notifications@github.com wrote:

Quick note, Five9 asked for a status update on this:

@s1monw can you confirm the ordering above (in @clintongormley 's comment)?


ppf2 commented 6 years ago

Thanks all!

@zuketo @alexfrancoeur @tbragin @debadair

What are the next steps here in terms of getting this officially documented? I see 3 main areas to cover here:

  1. Users currently on tribe who, for some reason, are not ready to migrate to CCS yet and want to continue using tribe on 6.x (even though it is deprecated) for some time before they switch to CCS.

  2. (Likely not too many) Users currently on CCS (5.x) and upgrading to 6.x.

  3. Users currently on tribe and migrating to CCS as part of their upgrade to 6.x. I am assuming we will be recommending the same separate-CCS-cluster-in-parallel approach from the KB article Alex wrote for CCS beta here. Is this still the recommendation? Will we be updating the KB article to advertise the export/import API for Kibana, or are we not ready to recommend it yet? This seems like something we will want to start documenting officially instead of having the information in a KB article.

zuketo commented 6 years ago

Hey @ppf2, for next steps:

I'll complete a Google doc with the ES upgrade notes (covering all three of your points), which will hopefully have everything @debadair needs to convert into documentation. I believe Kibana should reference the ES documentation whenever there is overlap, to keep everything up-to-date/current in the future.

zuketo commented 6 years ago

@debadair I have a first draft of the Google doc here: https://docs.google.com/document/d/1U33G4oqCp6gin0CJc3jlG10C-DH8Sfe7yXPpC1FZDpQ/edit#

Please let me know if there is anything that needs additional work before we move this to a PR.

ppf2 commented 6 years ago

Added various comments to the draft to address before the PR :)

zuketo commented 6 years ago

Hi @debadair, do you have time to help with getting this added to the docs?