hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/
Other
30.11k stars 4.13k forks source link

Adjust Vault Agent Log Level Until Dependencies Resolved (or Resolution Fails) #13466

Open freimer opened 2 years ago

freimer commented 2 years ago

Debugging and troubleshooting is difficult with current log levels for Vault Agent. Specifically, If you use a parameterized path for kv secrets, such as the following:

{{- with secret (printf "kv/path_prefix/%s/env" (file "/vault/secrets/dag_id" | trimSpace)) -}}
{{ range $k, $v := .Data.data }}
{{ $k }}={{ $v }}
{{- end -}}
{{- end }}

You will get warning such as the following with the default INFO level logging:

2021-12-16T16:16:24.388Z [WARN] (view) vault.read(kv/path_prefix//env): no secret exists at kv/data/path_prefix/env (retry attempt 1 after "250ms")

This suggest that the file /vault/secrets/dag_id has a null content, or content that was nulled by the trimSpace. This causes customers to go down the rabbit hole and waste time thinking that there is an issue with this file.

Turning logging level to DEBUG shows the following (only pertinent logs shown):

2021-12-16T16:30:27.786Z [DEBUG] (runner) missing dependency: file(/vault/secrets/dag_id)
2021-12-16T16:30:27.786Z [DEBUG] (runner) missing dependency: vault.read(kv/path_prefix//env)
2021-12-16T16:30:27.786Z [DEBUG] (runner) add used dependency file(/vault/secrets/dag_id) to missing since isLeader but do not have a watcher
2021-12-16T16:30:27.786Z [DEBUG] (runner) add used dependency vault.read(kv/path_prefix//env) to missing since isLeader but do not have a watcher
2021-12-16T16:30:27.786Z [DEBUG] (watcher) adding file(/vault/secrets/dag_id)
2021-12-16T16:30:27.786Z [DEBUG] (watcher) adding vault.read(kv/path_prefix//env)
2021-12-16T16:30:27.811Z [WARN] (view) vault.read(kv/path_prefix//env): no secret exists at kv/data/path_prefix/env (retry attempt 1 after "250ms")
2021-12-16T16:30:27.899Z [DEBUG] (runner) missing dependency: file(/vault/secrets/dag_id)
2021-12-16T16:30:27.899Z [DEBUG] (runner) missing dependency: vault.read(kv/path_prefix//env)
2021-12-16T16:30:27.902Z [DEBUG] (runner) missing dependency: vault.read(kv/path_prefix/dag_id_1234/env)
2021-12-16T16:30:27.902Z [DEBUG] (runner) add used dependency vault.read(kv/path_prefix/dag_id_1234/env) to missing since isLeader but do not have a watcher
2021-12-16T16:30:27.902Z [DEBUG] (runner) was not watching 1 dependencies
2021-12-16T16:30:27.902Z [DEBUG] (watcher) adding vault.read(kv/path_prefix/dag_id_1234/env)
more logs...

So it is clear that the contents of the vault/secrets/dag_id file is dag_id_1234, which exists before vault agent is run.

What I'd like to see is that WARN log message turned into a DEBUG, or at most INFO, if there are pending dependencies that have not been resolved yet. There is no reason to log this as a WARN at this point in the process. It distracts and wastes time in the troubleshooting process. Normal processing should not produce WARN log messages. I can see CRITICAL and ERROR messages if the vault agent process itself fails, due to internal panic or something. WARN messages perhaps if the end-result of the resolution of dependencies fail and vault agent is ultimately not able to render the file. But for situations such as these where there is no issue rendering the file, there should be no WARN messages.

There really are no alternatives to fixing the log levels. For customers that parse the logs and show service health or even alert on warning messages, this would mean either false positives, or having to create custom parsers, perhaps specific to individual applications, to ignore these log messages.

hsimon-hashicorp commented 2 years ago

Thanks for this suggestion! This will need some conversations within the engineering teams, I think, which may lead to some delays in replies. But we'll be taking it under advisement. :)