hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/
Other
30.1k stars 4.13k forks source link

`VAULT_AGENT_ADDR` has higher precedence than `-agent-address` #27404

Open rrjjvv opened 1 month ago

rrjjvv commented 1 month ago

Describe the bug From the docs:

Flags always take precedence over the environment variables.

Thus, given

export VAULT_AGENT_ADDR=http://env-host
vault <cmd> -agent-address http://cli-host <...>

I would expect the 'cli-host' to be used. I do not know if this holds true for every sub-command, but vault status and vault read use the environment variable (I suspect it affects the majority of sub-commands, if not all).

To Reproduce I encountered this issue in a live environment (actual running server and agent), but the actual issue does not require these and using -output-curl-string is a convenient/quick way to illustrate in isolation.

$ export VAULT_AGENT_ADDR=http://env.invalid
$ env | grep -i vault
VAULT_AGENT_ADDR=http://env.invalid
$ vault status -agent-address http://cli.invalid -output-curl-string 
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://env.invalid/v1/sys/seal-status

Expected behavior

$ export VAULT_AGENT_ADDR=http://env.invalid
$ env | grep -i vault
VAULT_AGENT_ADDR=http://env.invalid
$ vault status -agent-address http://cli.invalid -output-curl-string 
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://cli.invalid/v1/sys/seal-status

(using address http://cli.invalid, not http://env.invalid)

Environment:

Vault server configuration file(s): N/A

Additional context I created a longer reproduction to

#!/bin/sh

: "${VAULT:=./bin/vault}"

export CLI_AGENT_ADDR=http://cli-agent.test
export CLI_ADDR=http://cli-server.test
export ENV_AGENT_ADDR=http://env-agent.test
export ENV_ADDR=http://env-server.test

unset VAULT_ADDR
unset VAULT_AGENT_ADDR

"$VAULT" -version
echo
(
  echo "[WORKS] expect cli-agent.test (agent flag trumps server flag)"
  "$VAULT" status -address="$CLI_ADDR" -agent-address="$CLI_AGENT_ADDR" -output-curl-string
)
echo
(
  echo "[WORKS] expect env-agent.test (agent env trumps server env)"
  export VAULT_ADDR=$ENV_ADDR
  export VAULT_AGENT_ADDR=$ENV_AGENT_ADDR
  "$VAULT" status -output-curl-string
)
echo
(
  echo "[WORKS] expect env-agent.test (agent env trumps server flag, even though flags wins for given setting)"
  export VAULT_AGENT_ADDR=$ENV_AGENT_ADDR
  "$VAULT" status -address="$CLI_ADDR" -output-curl-string
)
echo
(
  echo "[WORKS] expect cli-server.test (server flag trumps server env)"
  export VAULT_ADDR=$ENV_ADDR
  "$VAULT" status -address="$CLI_ADDR" -output-curl-string
)
echo
(
  echo "[OOPS] expect cli-agent.test (agent flag trumps agent env)"
  export VAULT_AGENT_ADDR=$ENV_AGENT_ADDR
  "$VAULT" status -agent-address="$CLI_AGENT_ADDR" -output-curl-string
)

I ran this against a few release binaries; all behaved the same, so only showing the oldest version I tried:

$ VAULT=~/bin/vault sh run.sh 
Vault v1.11.6 (a40b3ff5d46f3a9a0575835e37ec58f2696c140a), built 2022-11-23T12:36:56Z

[WORKS] expect cli-agent.test (agent flag trumps server flag)
curl -H "X-Vault-Token: $(vault print token)" -H "X-Vault-Request: true" http://cli-agent.test/v1/sys/seal-status

[WORKS] expect env-agent.test (agent env trumps server env)
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://env-agent.test/v1/sys/seal-status

[WORKS] expect env-agent.test (agent env trumps server flag, even though flags wins for given setting)
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://env-agent.test/v1/sys/seal-status

[WORKS] expect cli-server.test (server flag trumps server env)
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://cli-server.test/v1/sys/seal-status

[OOPS] expect cli-agent.test (agent flag trumps agent env)
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://env-agent.test/v1/sys/seal-status

This is from main (at some point last night), which also has the issue, but with some selective logging to show the flow:

 $ sh run.sh 
Vault v1.18.0-beta1 ('0513545dd8213ffcbb3406c25cda69cd0a5b0e47+CHANGES'), built 2024-06-06T17:37:50Z

[WORKS] expect cli-agent.test (agent flag trumps server flag)
base.go:116: BaseCommand.Client(): overriding server from https://127.0.0.1:8200 to http://cli-server.test using server flag
base.go:120: BaseCommand.Client(): overriding server from http://cli-server.test to http://cli-agent.test using proxy flag
client.go:666: NewClient(): setting adress to http://cli-agent.test (from config's server address)
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://cli-agent.test/v1/sys/seal-status

[WORKS] expect env-agent.test (agent env trumps server env)
client.go:407: ReadEnvironment(): read server address http://env-server.test from environment
client.go:413: ReadEnvironment(): read agent address http://env-agent.test from environment
base.go:116: BaseCommand.Client(): overriding server from http://env-server.test to http://env-server.test using server flag
base.go:120: BaseCommand.Client(): overriding server from http://env-server.test to http://env-agent.test using proxy flag
client.go:666: NewClient(): setting adress to http://env-agent.test (from config's server address)
client.go:669: NewClient(): overwriting address to http://env-agent.test (from config's agent address)
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://env-agent.test/v1/sys/seal-status

[WORKS] expect env-agent.test (agent env trumps server flag, even though flags wins for given setting)
client.go:413: ReadEnvironment(): read agent address http://env-agent.test from environment
base.go:116: BaseCommand.Client(): overriding server from https://127.0.0.1:8200 to http://cli-server.test using server flag
base.go:120: BaseCommand.Client(): overriding server from http://cli-server.test to http://env-agent.test using proxy flag
client.go:666: NewClient(): setting adress to http://env-agent.test (from config's server address)
client.go:669: NewClient(): overwriting address to http://env-agent.test (from config's agent address)
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://env-agent.test/v1/sys/seal-status

[WORKS] expect cli-server.test (server flag trumps server env)
client.go:407: ReadEnvironment(): read server address http://env-server.test from environment
base.go:116: BaseCommand.Client(): overriding server from http://env-server.test to http://cli-server.test using server flag
client.go:666: NewClient(): setting adress to http://cli-server.test (from config's server address)
curl -H "X-Vault-Token: $(vault print token)" -H "X-Vault-Request: true" http://cli-server.test/v1/sys/seal-status

[OOPS] expect cli-agent.test (agent flag trumps agent env)
client.go:413: ReadEnvironment(): read agent address http://env-agent.test from environment
base.go:116: BaseCommand.Client(): overriding server from https://127.0.0.1:8200 to https://127.0.0.1:8200 using server flag
base.go:120: BaseCommand.Client(): overriding server from https://127.0.0.1:8200 to http://cli-agent.test using proxy flag
client.go:666: NewClient(): setting adress to http://cli-agent.test (from config's server address)
client.go:669: NewClient(): overwriting address to http://env-agent.test (from config's agent address)
curl -H "X-Vault-Request: true" -H "X-Vault-Token: $(vault print token)" http://env-agent.test/v1/sys/seal-status

So the issue is that on the command side, the environment is read early https://github.com/hashicorp/vault/blob/d382103f62643780f406e16f385c536a177702c3/command/base.go#L101 and proceeds to (correctly) prioritize flag values (as well as "agent trumps server") https://github.com/hashicorp/vault/blob/d382103f62643780f406e16f385c536a177702c3/command/base.go#L105-L110 but the problem is that on the api side (which has no concept of flags), it performs the same "agent trumps server", with its only data source being the environment https://github.com/hashicorp/vault/blob/d382103f62643780f406e16f385c536a177702c3/api/client.go#L637-L639

In short, the config on the command side is deliberately/correctly configured before passing it to NewClient on the api side, but NewClient isn't fully respecting it.

I have a possible fix but I can't validate it, since the test suite (make test) has failures (on a pristine checkout). So before spending time trying to figure that out, I figured I'd wait for confirmation that it is indeed a bug and that it should be fixed. As well, the conversation on the PR that introduced these changes made it seem like "agent wins over server" had some nuances and that maybe this 'bug' isn't as straightforward as I thought, and maybe you want to handle it yourselves.

rrjjvv commented 1 month ago

I created a test for all sixteen possible combination of flags and environment variables (or lack thereof). Half of them (in my opinion) simply don't make sense and if they occurred, would probably be user or developer bugs. They might be best left as undefined behavior, but I still wanted to enumerate them, figure out my logical expectations, and see the behavior both pre- and post-fix.

Without changes, four out of the sixteen possibilities fail; one is my bug report, one is a variation of my report, and two are "this probably wouldn't happen in the real world". With my proposed changes, all sixteen pass (and all other existing tests continue to pass as well).

I'll create a PR early next week with my proposal unless you indicate you want to hash it out here, or handle it yourselves.