hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

Issue with the vault cache in the secondary datacenter in multi-datacenter setup #18251

Closed: alekhrj closed this issue 1 year ago

alekhrj commented 1 year ago

Describe the bug
Setup: multi-DC Vault setup with Consul as the storage backend; the Consul data is replicated between datacenters using consul-replicate.


Bug: Vault caches all responses (both valid values and nil/error results) from the storage backend. If we try to read a value in the secondary DC before replication completes, that data remains unavailable even after successful replication.
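The behavior described above can be illustrated with a toy read-through cache that also memoizes misses. This is a simplified model of the reported symptom, not Vault's actual cache implementation; the class and key names are hypothetical:

```python
# Toy model of a read-through cache that also caches misses (None).
# Illustrates the reported behavior only; not Vault's actual code.

class NilCachingStore:
    def __init__(self, backend):
        self.backend = backend  # dict standing in for Consul storage
        self.cache = {}

    def read(self, key):
        if key in self.cache:          # hit: return cached value, even if None
            return self.cache[key]
        value = self.backend.get(key)  # miss: fall through to storage
        self.cache[key] = value        # note: the nil result is cached too
        return value

storage = {}                           # secondary DC's storage, pre-replication
store = NilCachingStore(storage)

print(store.read("transit/keys/k1"))   # None: key not replicated yet
storage["transit/keys/k1"] = "aes256"  # replication copies the key over
print(store.read("transit/keys/k1"))   # still None: the nil result was cached
store.cache.clear()                    # step-down / leader election drops cache
print(store.read("transit/keys/k1"))   # now readable
```

Clearing the cache in the last step plays the role of the `vault operator step-down` in the reproduction below: only after the cached nil entry is dropped does the replicated key become visible.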

To Reproduce

1. Read a non-existent key in the secondary DC
   1.1. Run `vault read transit/keys/non-existent-key-1`

2. Create two keys (one with the same name as in step 1.1) in the primary DC
   2.1. Run `vault write -f transit/keys/non-existent-key-1`
   2.2. Run `vault write -f transit/keys/non-existent-key-2`

3. Read the keys in the secondary cluster
   3.1. Run `vault read transit/keys/non-existent-key-1`
   3.2. Run `vault read transit/keys/non-existent-key-2`

4. Force a leader election in the secondary cluster (run on the active node)
   4.1. Run `vault operator step-down`

5. Read the keys in the secondary cluster again
   5.1. Run `vault read transit/keys/non-existent-key-1`
   5.2. Run `vault read transit/keys/non-existent-key-2`

Expected behavior
Step 3.1 should give the error "No value found at transit/keys/non-existent-key-1", while steps 3.2, 5.1, and 5.2 should not give errors, proving that there is no issue with the replication itself.

Similar behavior is observed with other backends as well, such as kv and database.

Environment:

* Vault CLI Version (retrieve with `vault version`):
Vault v1.13.0-dev1

* Server Operating System/Architecture:

```shell
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
VERSION_CODENAME=stretch
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
```


Vault server configuration file(s):

```hcl
# primary DC
{
  "ui": true,
  "telemetry": {
    "statsd_address": "127.0.0.1:8125"
  },
  "storage": {
    "consul": {
      "address": "127.0.0.1:8500",
      "path": "vault",
      "obfuscate_paths": 1,
      "token": "e98xxxxx-8xxx-8xx5-xxx6-d2xxxa5f9xxx"
    }
  },
  "listener": {
    "tcp": {
      "address": "0.0.0.0:8200",
      "tls_disable": 1
    }
  },
  "plugin_directory": "/var/lib/fk-sec-vault/plugins"
}

# secondary DC
{
  "disable_cache": false,
  "ui": true,
  "telemetry": {
    "statsd_address": "127.0.0.1:8125"
  },
  "storage": {
    "consul": {
      "address": "127.0.0.1:8500",
      "path": "vault",
      "obfuscate_paths": 1,
      "token": "e98xxxxx-8xxx-8xx5-xxx6-d2xxxa5f9xxx",
      "consistency_mode": "strong",
      "cache_size": 1310720
    }
  },
  "listener": {
    "tcp": {
      "address": "0.0.0.0:8200",
      "tls_disable": 1
    }
  },
  "plugin_directory": "/var/lib/fk-sec-vault/plugins"
}
```
heatherezell commented 1 year ago

Hi there! Can you tell me more about your use case here, using consul-replicate for Vault replication? Simply replicating the storage will not guarantee consistent results from Vault, as you've seen.

maxb commented 1 year ago

IIUC, the behaviour observed here actually counts as "Vault functioning as intended".

Vault expects and requires that it is the only thing that is writing to its storage, so by having consul-replicate write to storage that is also being used by a running Vault cluster, you create a situation where Vault will not function properly.

If you wanted to make this setup work, you would need to build a locking system so that consul-replicate, and Vault in the secondary datacenter, are never both running at the same time.

Even then, I'm not totally sure that consul-replicate would be able to preserve the ordering of operations well enough to guarantee it never places the destination Vault storage into a problematic state ... I'm thinking, what if it copies over storage entries encrypted by a new key, before the change to the keyring adding that new key?
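The ordering hazard described above can be sketched with a toy model: if a blind copier replicates a data entry encrypted under a new keyring version before it replicates the keyring update that adds that version, the destination cannot decrypt the entry until the keyring catches up. This is a hypothetical illustration; Vault's real barrier and keyring formats differ:

```python
# Toy model of the replication-ordering hazard: an entry "encrypted" under
# keyring version 2 arrives before the keyring update that adds version 2.
# Hypothetical sketch; Vault's actual barrier/keyring formats differ.

def decrypt(keyring, entry):
    version, payload = entry
    if version not in keyring:
        raise KeyError(f"no key version {version} in keyring")
    return payload

dest_keyring = {1}             # destination still only knows key version 1
entry_v2 = (2, "secret-data")  # copied over first by the replicator

try:
    decrypt(dest_keyring, entry_v2)
except KeyError as e:
    print("decryption failed:", e)

dest_keyring.add(2)            # keyring update finally replicated
print(decrypt(dest_keyring, entry_v2))
```

In the window between the two replication events, the secondary Vault would see storage entries it cannot decrypt, which is exactly the kind of problematic state a generic key-value replicator cannot guard against.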

alekhrj commented 1 year ago

I have some queries:

I am trying to understand the reason behind caching nil data. One reason could be to protect against DoS attacks on non-existent keys.

I would also like to understand what could go wrong if we do not cache nil data, given the statement above that "you create a situation where Vault will not function properly."

"I'm thinking, what if it copies over storage entries encrypted by a new key, before the change to the keyring adding that new key?" -> are we talking about the root key rotation scenario?

maxb commented 1 year ago

With your questions above, you've skipped past the really big important point to focus on some specific tiny parts of it, so I need to reiterate:

The Vault application is NOT written to support other software, such as consul-replicate, changing its data store underneath it whilst it is running.

You must not run consul-replicate whilst also running any Vault server in the secondary datacenter.

heatherezell commented 1 year ago

@maxb is correct here (thanks Max!). We cannot guarantee the results you're looking for if you change the storage underneath Vault. Caching, raft membership, and eventual consistency make this something that I don't believe we'll ever support. There are high-availability options available in Enterprise that can approximate what you're looking for (please see the page on Enterprise replication), but the way you're attempting to work around it will result in unexpected behavior at best and possible data corruption at worst. I'm going to close this issue now, but please feel free to open a new issue if you have further bug reports or enhancement requests. Thanks! :)

heatherezell commented 1 year ago

Although you're not using raft, apologies - the Integrated Storage versus Consul page makes it clear that Consul is all in-memory, which further complicates your use case here.