Open evilin13 opened 3 years ago
Hi @evilin13,
Can you tell me more about the motivation behind the downgrade (and why staying with 1.8.15 isn't an option)?
Downgrades are not specifically supported or tested. They can work in some cases, but most of the time they fail because the snapshot from 1.8.15 will contain a Raft log entry that Consul 1.4.0 doesn't understand.
If you know in advance when performing an upgrade that you might want to downgrade, we suggest creating a snapshot before upgrading. Then, if you need to downgrade, you can restore using a snapshot generated from the old/matching version rather than from a newer version (which may fail).
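As a rough illustration of the snapshot-before-upgrade approach described above, here is a minimal sketch that wraps the `consul snapshot save`, `inspect`, and `restore` subcommands. The file name `pre-upgrade.snap` and the helper names `snapshot_cmd` and `run` are made up for this example. It assumes the `consul` binary is on the PATH and can reach a server (with an ACL token if ACLs are enabled). By default it only prints the commands and does not execute them.

```python
import shlex
import subprocess
from typing import List


def snapshot_cmd(action: str, snapshot_file: str) -> List[str]:
    """Build a `consul snapshot <action> <file>` command line."""
    if action not in ("save", "inspect", "restore"):
        raise ValueError(f"unsupported snapshot action: {action}")
    return ["consul", "snapshot", action, snapshot_file]


def run(cmd: List[str], dry_run: bool = True) -> str:
    """Print the command; execute it only when dry_run is False."""
    line = shlex.join(cmd)
    print(line)
    if not dry_run:
        # check=True raises CalledProcessError if consul exits non-zero
        subprocess.run(cmd, check=True)
    return line


if __name__ == "__main__":
    snap = "pre-upgrade.snap"  # hypothetical file name

    # Before upgrading, while still on the old version:
    run(snapshot_cmd("save", snap))
    run(snapshot_cmd("inspect", snap))

    # After downgrading back to the old version:
    run(snapshot_cmd("restore", snap))
```

Pass `dry_run=False` to actually invoke Consul. The important part is the ordering: the snapshot is taken by the old version before the upgrade, so the restore after a downgrade never has to read data written by the newer version.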
Hi @jkirschner-hashicorp ,
Thank you very much for the prompt and detailed response! Rollback is one of the supported actions between two releases of the product I'm working on, and I recently upgraded Consul in the latest release. When the testing team ran the rollback scenario to the previous release, we hit the reported problem. So I'm looking for a safe way to keep Consul working in the rollback scenario even when the Consul version changes between two releases of our product.
I tried to create the snapshots before upgrading, then removed all three raft.db files and downgraded to the old version.
The Consul servers started as expected, but before restoring the snapshots I queried the Vault server (Consul is the Vault server's storage backend in our case) and all the secrets were there. I'm not very familiar with Consul (yet), so I thought that by removing the raft.db files I would lose the KV store's content. Last week I was removing the entire raft directory, not only the raft.db files, which is why I was losing all the data. :(
Given that I only need to ensure that the secrets/KV store content is not lost during the rollback and that all 3 Consul servers are operational afterwards, would it be enough to remove the raft.db files before starting the rollback and skip creating/restoring snapshots altogether?
Thank you!
Just want to give this some more attention. It is common practice to test a downgrade before upgrading a production system. I'm working on upgrading Consul, but before we can go to production we need to check whether a downgrade is possible. I hope there is some room to at least create some documentation on how to downgrade, covering what is and what is not supported.
I’ve faced similar questions in the past from some of my customers, and my recommendation has been that, to support a rollback/downgrade, customers should follow these steps:
It is important to note that a snapshot/backup from a higher version of Consul may not be compatible with a lower version; this depends on underlying changes in Raft storage and similar internals.
Overview of the Issue
When downgrading from Consul version 1.8.15 to 1.4.0, none of the three servers can start. The following log is printed: "Failed to start Consul server: Failed to start Raft: failed to load any existing snapshots"
I removed the snapshot directories under the 3 consul data directories and restarted the server agents.
Now there is a panic error when starting the agents:
As I read (and also verified) here and here, removing the raft.db file fixes the issue, but we cannot afford to lose the data stored there.
I also tried to:
consul conf file:
Could you please check this issue and suggest how we could overcome it? Or should we conclude that upgrading/downgrading between those versions is not supported?
Thank you, --Evi