Open craftey opened 1 year ago
Hi @jefferai. I am mentioning you here because you helped some time ago with the Vault GCS storage backend in https://github.com/hashicorp/vault/issues/5746. I would like to politely ask whether you have any advice for us regarding the above issue. A shorter summary of the issue is available at https://discuss.hashicorp.com/t/transit-mount-storage-gcs-error-rollback-error-rolling-back-context-deadline-exceeded/58930. We want to update Vault to the latest version, but with the latest version we see rollback manager errors in the log when using the GCS storage backend. Thanks in advance.
Dear all, any feedback about this issue? Thanks.
I am experiencing the same error using the Transit engine with a Postgres backend. In my case, I have around 500,000 keys created. According to the code comments in vault/rollback.go, this rollback is caused by partial errors in the operations, but it's not clear to me whether this means that certain key creation, encryption, or decryption operations are not being performed. Is this critical? Is it possible to see in detail which errors are occurring?
Error:
[ERROR] rollback: error rolling back: path=transit/
error=
| 121794 errors occurred:
| \t* context deadline exceeded
| \t* context deadline exceeded
| \t* context deadline exceeded
...
Environment:
1.15.2 and 1.16.1
0.27.0 and 0.28.0

Thanks in advance,
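On the question of seeing which errors occur: a minimal sketch, assuming the server log has been saved to a file in the format quoted above. The helper name `summarize_rollback_errors` is hypothetical and not part of Vault. It only counts what is already in the log (rollback failures per mount path and the number of "context deadline exceeded" entries) and does not reveal which individual keys were affected.

```shell
#!/bin/sh
# Hypothetical helper (not part of Vault): summarize rollback errors in a
# saved server log. Assumes the log format quoted in this thread, e.g.
#   [ERROR] rollback: error rolling back: path=transit/
#   | \t* context deadline exceeded
summarize_rollback_errors() {
  logfile=$1
  # Count rollback failures per mount path.
  grep -o 'error rolling back: path=[^ ]*' "$logfile" \
    | sed 's/.*path=//' | sort | uniq -c
  # Count the individual "context deadline exceeded" sub-errors.
  printf 'deadline_exceeded=%s\n' \
    "$(grep -c 'context deadline exceeded' "$logfile")"
}
```

Usage would be along the lines of `summarize_rollback_errors vault.log`, where `vault.log` is an assumed file name.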
Hi @hsimon-hashicorp, can you get the right people looking at this? My original description contains a minimal example that should be easily reproducible. I also pinned down the version in which the issue started happening. I had some questions at the end of my post as well, but unfortunately no one has had time to answer them. Thanks in advance.
Thanks for the ping! I've re-surfaced this issue with our engineering teams. Hopefully we can collectively get to the bottom of this!
Hi @hsimon-hashicorp! Is there any news about this?
Thanks
We face the issue in our dev and production environments. To reproduce it with a fresh Vault, I tested several versions locally on a MacBook with an empty GCS storage backend. I found that versions 1.9.4 and 1.9.10 do not have the issue. With 1.10.0 and later, e.g. 1.11.1 or 1.14.4, the error can be reproduced. So I believe this bug was introduced in 1.10.0 and has never been fixed in later versions.
Bug description
3-5 minutes after the server starts, and then every hour, we see in the log:
To Reproduce
brew install vault
Create vault-config.hcl with the following contents (use a nice bucket name):
vault server -config vault-config.hcl
Create add_keys.sh:
total=50
for c in $(seq -w 0 $((total-1))); do
  # Write 100 keys in parallel, then wait for the whole batch to finish.
  for i in $(seq -w 0 99); do
    vault write -f customer-keys/keys/$c$i >/dev/null &
  done >/dev/null 2>&1
  wait
  # Progress output; the "1$c ... -100" form avoids octal parsing of values like 08.
  echo $((1$c+1-100))00/${total}00
done
chmod +x add_keys.sh
./add_keys.sh
Count the written keys; 4000 keys or more is enough to reproduce the issue:
vault list -format=yaml customer-keys/keys | wc -l
[ERROR] rollback: error rolling back: path=customer-keys/
curl https://raw.githubusercontent.com/Homebrew/homebrew-core/a0ce0e6ce3c921a26db90dfe8c38b4df9f227669/Formula/vault.rb > /tmp/vault.rb # version 1.10.0
brew reinstall --formula /tmp/vault.rb
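As a dry run of the naming scheme used by add_keys.sh above, here is a sketch that prints the key paths instead of calling vault. The helper name `list_key_paths` is hypothetical and not part of the original report. With total=50 it yields 50 x 100 = 5000 paths, which is above the roughly 4000 keys mentioned as enough to reproduce the issue.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the key paths that add_keys.sh would
# write, without contacting Vault.
list_key_paths() {
  total=$1
  for c in $(seq -w 0 $((total-1))); do
    for i in $(seq -w 0 99); do
      # Same path layout as the "vault write -f" call in add_keys.sh.
      echo "customer-keys/keys/$c$i"
    done
  done
}
```

For example, `list_key_paths 50 | wc -l` gives the number of keys the script attempts to create.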