hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

Vault cannot be unsealed with error "Vault is not initialized" But It is already initialized #15680

Closed shujun10086 closed 2 years ago

shujun10086 commented 2 years ago

Describe the bug Vault can no longer be unsealed after a restart. The Vault server was restarted because its liveness probe failed. The probe uses curl -m 1 http://127.0.0.1:5817/v1/sys/health to check Vault health. The probe interval is 5s and the failure threshold is 3.
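
For readers following the probe setup, here is a minimal Python sketch of how a probe could interpret responses from /v1/sys/health. The status table follows the default codes in Vault's API documentation; the helper names `describe_health` and `is_alive`, and the policy of treating any HTTP response as "alive", are illustrative assumptions and not part of this report. Note that curl without --fail exits 0 on any HTTP response, so a probe written as above fails only on a timeout or connection error.

```python
# Default /v1/sys/health status codes as documented for Vault.
HEALTH_STATUS = {
    200: "initialized, unsealed, active",
    429: "unsealed, standby",
    472: "disaster recovery secondary",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def describe_health(status_code):
    """Return a human-readable state for a health status code."""
    if status_code is None:
        return "no response (timeout or connection error)"
    return HEALTH_STATUS.get(status_code, "unexpected status %d" % status_code)


def is_alive(status_code):
    """Hypothetical liveness policy: any HTTP answer means the process is alive.

    A sealed (503) or uninitialized (501) Vault still answers requests, so
    restarting it would not help. Only a missing response counts as dead.
    """
    return status_code is not None
```

With this policy, `is_alive(503)` is true: a sealed Vault needs an unseal, not a restart.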

After the restart, our own program unseals Vault using the restored unseal key and root token file.

It checks Vault's init status via http://127.0.0.1:5817/v1/sys/init, and the response says Vault is initialized. But it can no longer unseal Vault. URL: PUT http://127.0.0.1:5817/v1/sys/unseal Code: 400. Errors:
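
The check-then-unseal flow described above can be sketched roughly as follows. This is not the reporter's actual program: the function names `call` and `try_unseal`, the `transport` parameter, and the returned strings are hypothetical. The endpoints, the listener address, and the `initialized`, `sealed`, `key`, and `errors` JSON fields are the ones from this report and Vault's HTTP API.

```python
import json
import urllib.error
import urllib.request

BASE = "http://127.0.0.1:5817"  # listener address from the report


def call(method, path, body=None, base=BASE, timeout=5):
    """Send one JSON request and return (status_code, decoded_body)."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(base + path, data=data, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read() or b"{}")
    except urllib.error.HTTPError as err:
        # Vault reports failures as {"errors": [...]} with a 4xx/5xx code.
        return err.code, json.loads(err.read() or b"{}")


def try_unseal(key, transport=call):
    """Check init status, then submit one unseal key share."""
    _, body = transport("GET", "/v1/sys/init")
    if not body.get("initialized"):
        return "not-initialized"
    status, body = transport("PUT", "/v1/sys/unseal", {"key": key})
    if status == 200:
        return "sealed" if body.get("sealed") else "unsealed"
    return "unseal-failed: " + "; ".join(body.get("errors", []))
```

Surfacing the `errors` list from the 400 response makes the contradiction in this report visible in the caller's logs: the init check passes while the unseal call fails.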

Checking the Vault DB files, the _keyring file appears to be missing. The following are the Vault DB core files. Usually there is a _keyring file, but when the problem happened it was gone. I am not sure whether that is just a result of the Vault restart.

bash-5.1$ ls -l /mnt/services/vault/DB/core/
total 6
-rw-------. 1 9999 9999 397 May 24 09:31 _audit
-rw-------. 1 9999 9999 537 May 24 09:31 _auth
-rw-------. 1 9999 9999 133 May 24 09:31 _local-audit
-rw-------. 1 9999 9999 133 May 24 09:31 _local-auth
-rw-------. 1 9999 9999 417 May 24 09:31 _local-mounts
-rw-------. 1 9999 9999 209 May 29 20:51 _master
-rw-------. 1 9999 9999 709 May 24 09:31 _mounts
-rw-------. 1 9999 9999 169 May 24 09:31 _seal-config
-rw-------. 1 9999 9999 101 May 24 09:31 _shamir-kek
drwx------. 3 9999 9999 2 May 24 09:31 cluster
drwx------. 2 9999 9999 1 May 24 09:31 hsm
drwx------. 2 9999 9999 1 May 24 09:31 wrapping
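
A check like the manual `ls` above could be automated before attempting an unseal. The following is a minimal sketch that assumes the file storage backend layout shown in the listing (a `core` directory containing `_keyring` and `_master`). The function name `missing_core_entries` and the choice of which entries to require are illustrative assumptions.

```python
from pathlib import Path


def missing_core_entries(storage_path, required=("_keyring", "_master")):
    """Return the required entries that are absent from <storage_path>/core.

    The entry names come from the directory listing in this report;
    whether this set is sufficient for a healthy Vault is an assumption.
    """
    core = Path(storage_path) / "core"
    return [name for name in required if not (core / name).is_file()]
```

For the directory listed above, `missing_core_entries("/mnt/services/vault/DB")` would report `_keyring` as missing, which a wrapper could log or alert on instead of retrying the unseal.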

To Reproduce Steps to reproduce the behavior:

  1. Run the Vault server until it is killed by the livenessProbe
  2. After it is restarted, it cannot be unsealed anymore

It is still not clear why the Vault server did not respond to the health request, so this is difficult to reproduce.

Expected behavior After Vault restarts, it can still be unsealed normally.

Environment:

Vault server configuration file(s):

# Paste your Vault config here.
# Be sure to scrub any sensitive values

bash-5.1$ cat /etc/vaultserver/server.hcl
storage "file" {
  path = "/mnt/services/vault/DB"
}

listener "tcp" {
  address = "127.0.0.1:5817"
  tls_disable = 1
}

cache_size = 100
disable_mlock = true

Additional context Could you explain, from the Vault source code point of view, why the error is "Vault is not initialized" when Vault is already initialized?

maxb commented 2 years ago

Vault uses the presence of the keyring to test whether it has been initialized. So, mysterious loss of the keyring file from backing storage would seem to explain this behaviour.

shujun10086 commented 2 years ago

I suspect that Vault got stuck while rotating the keyring file and was killed by the liveness probe, and that the file is gone because it may be deleted during rotation. From the source code, it looks like the rotation happens at a fixed interval. Can that interval be configured?

Another question: why does http://127.0.0.1:5817/v1/sys/init report that Vault is already initialized? Does it not use the keyring file to test whether Vault is initialized?

shujun10086 commented 2 years ago

Can Vault make sure the keyring rotation completes successfully when it handles signal 15 (SIGTERM)?

sathuish commented 1 year ago

Is there a way to generate the file manually? We have faced this in our prod environment.

shujun10086 commented 1 year ago

> Is there a way to generate the file manually? We have faced this in our prod environment.

It seems not. The file is updated every 5 minutes by default. Even if I save a copy of the file and put it back manually, the unseal still does not succeed. Which version of Vault do you use?

sathuish commented 1 year ago

Vault 1.7.1. Is there any other way to unseal Vault now?

maxb commented 1 year ago

Question was also asked at https://discuss.hashicorp.com/t/keyring-file-is-missing-under-core-directory/43588 . My answer from there:

There is no way - the keyring file contained the encryption keys with which all the user data in Vault is encrypted.

With it lost, unless you have your own backup elsewhere, all the data is permanently lost.
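
Since recovery depends on having your own backup, here is a minimal sketch of archiving the file storage directory. It assumes the file storage backend from this report and that Vault is stopped while the copy is taken, because an archive of a live directory may capture an inconsistent state. The function name `backup_storage` and the refusal to archive a directory without `core/_keyring` are illustrative choices, not a Vault feature.

```python
import tarfile
from pathlib import Path


def backup_storage(storage_path, archive_path):
    """Archive the whole storage directory into a gzip-compressed tarball.

    Refuses to run when core/_keyring is absent, so that a backup job
    does not quietly replace a good archive with an unusable one.
    """
    storage = Path(storage_path)
    if not (storage / "core" / "_keyring").is_file():
        raise FileNotFoundError("core/_keyring is missing in %s" % storage)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # Store entries under the directory's own name, e.g. DB/core/_keyring.
        tar.add(str(storage), arcname=storage.name)
    return archive_path
```

The archive holds the encrypted keyring and data, so it is only useful together with the unseal keys, which should be stored separately.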