Closed jbouse closed 1 year ago
I experienced this on EKS 1.27, so it might not be version specific (or tightly constrained anyway).
What was really odd to me is that there weren't any significant differences between the stateful sets generated by 0.24.1 and 0.25. (Sorry, all I have is a screenshot of it, not the actual text.)
Yes, I also compared 0.24.1 and 0.25.0 and couldn't see any drastic change that would have caused this behavior. Still, I observed the same behavior on multiple EKS clusters, and reverting cleared up the issue, so something less obvious has changed.
That raises the question: is the change in the chart or in Vault 1.14.0 itself? Perhaps I'll try setting server.image.tag to 1.14.0 on the 0.24.1 chart and see if I get the same failure. That would help narrow down whether the chart or Vault is at fault.
I haven't looked into this yet, but I think you're on the right track with it being 1.14.0, based on https://github.com/hashicorp/vault/issues/21465. This will be a pretty high priority to get fixed. I'd recommend following along in that issue for any updates.
@tomhjp you appear to be correct there... I just finished running with server.image.tag: 1.14.0 on chart v0.24.1, and it failed the same way I was seeing with chart 0.25.0. So it does appear to be related to the Vault ticket against 1.14.0. The comments in that ticket seem to recommend a fix that includes role_arn and web_identity_token_file, which suggests that Vault 1.14.0 is failing to read the environment variables applied to the pod: both AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are set, along with other env variables the SDK should pick up and use.
Okay, so I'm convinced this is a Vault 1.14.0 issue, not the chart at fault here... I've upgraded to the 0.25.0 chart but set server.image.tag: 1.13.4, and it works perfectly fine. I'll maintain this until Vault is fixed... I also found hashicorp/vault#21478, which describes what I've seen and includes a pointer to what broke.
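For anyone landing here, the pin described above would look roughly like this in a values override file. This is a minimal sketch: only server.image.tag and the 1.13.4 value come from this thread, the file name is hypothetical, and every other chart value is assumed to stay as it already is in your setup.

```yaml
# values-pin.yaml (hypothetical file name)
# Keep the 0.25.0 chart but run the Vault 1.13.4 image instead of 1.14.0.
server:
  image:
    # Quoted so YAML treats the tag as a string, not a number.
    tag: "1.13.4"
```

Per the later comments in this thread, changing the tag to 1.14.1 reportedly works as well.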
I'll go ahead and close this, as it isn't the chart at fault.
@jbouse I tried setting server.image.tag: 1.14.1 and it's also working now
Can confirm it works as expected with chart version 0.25.0 using the 1.14.1 image. I also ran into the issue of my IRSA not being used properly with the 1.14.0 image tag.
Thanks @brunooon 👍
Describe the bug After upgrading the Helm chart from v0.24.1 to v0.25.0 on an AWS EKS 1.24 cluster, the new pod running 1.14.0 is unable to unseal Vault using the AWS KMS key when it restarts. The log shows access denied to the KMS key for the assumed role of the EKS node, not the role assigned to the service account via annotation.
I've observed the same behavior on multiple EKS clusters, and reverting to the 0.24.1 chart fixes it.
To Reproduce Steps to reproduce the behavior:
Expected behavior The pod should come back online successfully after a restart, as it has after every other upgrade in the past.
Environment
Chart values: