hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/
Other
31.29k stars 4.23k forks source link

Exponential backoff in the LifetimeWatcher does not send an error to doneCh if we reach the lease expiration while failing to renew #28611

Open jSherz opened 1 month ago

jSherz commented 1 month ago

Describe the bug

In https://github.com/hashicorp/vault/pull/26868, the backoff package was upgraded to V4. Part of the difference between V3 and V4 of backoff is that the backoff.ExponentialBackOff struct now takes a "Stop" field, indicating the value to return when we should no longer retry the operation.

In the constructor they provide (backoff.NewExpontentialBackOff - https://github.com/cenkalti/backoff/blob/v4/exponential.go#L94), the Stop value is set to the constant in the package, a duration of -1 second. The Vault code however directly initialises the struct (https://github.com/hashicorp/vault/blob/c8e6169d5dbc82d99d904e468852902de98ebfd0/api/lifetime_watcher.go) and thus we end up with a duration of zero seconds for the "Stop" field.

The lifetime watcher later checks the returned value from the backoff to see if it's backoff.Stop. It never is, as the ExponentialBackoff we've initialised returns a time.Duration of zero seconds.

To Reproduce

  1. Try to use the lifetime watcher and have it error out when we've unsuccessfully retried to renew the lease and the lease duration has expired.

Expected behavior

r.doneCh should receive an error.

Environment:

    github.com/hashicorp/vault/api v1.15.0
    github.com/hashicorp/vault/api/auth/kubernetes v0.8.0
jSherz commented 1 month ago

I had a go at a fix in https://github.com/hashicorp/vault/pull/28616 - please let me know what you think. Is the current behaviour the expected one?