Vault "init" API used instead of "health" API - Causes Failures With Production Vault

adawalli commented 4 years ago

Nomad version

Nomad v0.12.0 (8f7fbc8e7b5a4ed0d0209968faf41b238e6d5817)

Operating system and Environment details

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:    16.04
Codename:   xenial

Issue

In Vault deployments where users are locked out of the /sys/init endpoint (https://www.vaultproject.io/api-docs/system/init), Nomad cannot use Vault at all. The /sys/health endpoint (https://www.vaultproject.io/api/system/health.html) should be used instead.

We pay for an enterprise version of Vault and have /sys/init locked down for security reasons. With the current implementation we cannot transition to Nomad, because we depend on Vault for secrets management.

Nomad Server logs

Jul 23 06:13:02 sectools-nomad-stage-01 nomad[6867]:     2020-07-23T06:13:02.797-0700 [WARN]  nomad.vault: failed to contact Vault API: retry=30s error="Get "https://vault.<domain>.com:8200/v1/sys/init": dial tcp 52.11.6.167:8200: i/o timeout"
Jul 23 06:13:02 sectools-nomad-stage-01 nomad[6867]:  nomad.vault: failed to contact Vault API: retry=30s error="Get "https://vault.<domain>.com:8200/v1/sys/init": dial tcp 52.11.6.167:8200: i/o timeout"

notnoop commented 4 years ago

@adawalli Thanks for reaching out - this is a good point. The health endpoint does seem more appropriate, and I'll open a PR to change it.

Curious about the security implications. Would it make sense to only lock down the PUT/POST methods on that endpoint but allow GET? Reading the Vault docs, GET /v1/sys/init seems safe.

adawalli commented 4 years ago

@notnoop - I will try to get temporary approval for GET access to sys/init. However, just because a Vault is initialized doesn't mean it's active and usable. It has to be initialized, unsealed, and active, so Nomad's use of the init endpoint seems flawed anyway.

Thank you for considering an enhancement.
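
To illustrate the distinction between "initialized" and "initialized, unsealed, and active", here is a minimal Python sketch of a check against /v1/sys/health. It is not Nomad's implementation (Nomad is written in Go). The function names and the address passed to `fetch_health` are hypothetical, and the status-code table assumes Vault's default health responses with no query parameters overriding them.

```python
import json
import urllib.error
import urllib.request

# Default status codes returned by GET /v1/sys/health
# (assumes no query parameters that override them).
HEALTH_STATUS = {
    200: "initialized, unsealed, and active",
    429: "unsealed and standby",
    472: "disaster recovery secondary",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def describe_status(code: int) -> str:
    """Map a sys/health HTTP status code to a human-readable state."""
    return HEALTH_STATUS.get(code, f"unexpected status {code}")


def is_usable(health: dict) -> bool:
    """Return True only if Vault is initialized, unsealed, and not a standby.

    A missing field is treated as "not usable" rather than guessed at.
    """
    return (
        health.get("initialized") is True
        and health.get("sealed") is False
        and health.get("standby") is False
    )


def fetch_health(addr: str, timeout: float = 5.0):
    """Query sys/health and return (status_code, parsed_json_body).

    `addr` is a hypothetical Vault address such as "https://vault.example.com:8200".
    """
    url = addr.rstrip("/") + "/v1/sys/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        # Non-active states arrive as non-2xx responses; the body is still JSON.
        return err.code, json.loads(err.read() or b"{}")
```

A caller would treat Vault as ready only when `is_usable(body)` is true, which is stricter than the single `initialized` flag that sys/init reports.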

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
