hashicorp / vault-plugin-secrets-azure

Vault Azure Secrets plugin
Mozilla Public License 2.0

SIGSEGV on azure role unassignments #190

Closed Foxboron closed 4 months ago

Foxboron commented 4 months ago

We were in the middle of upgrading our Vault cluster from 1.15.6 to 1.16.1 when the servers started failing. It seems that unassigning Azure roles causes the Vault service to crash.

I suspect the root cause is the change from the REST API client to the Graph API client (linked below), and that rawResponse can still be nil when we read its StatusCode.

Replace go-autorest MS Graph client with msgraph-sdk-go (https://github.com/hashicorp/vault-plugin-secrets-azure/pull/169) https://github.com/hashicorp/vault-plugin-secrets-azure/commit/416c8fd8f567360cc8187bbc20c0d3fdb5c68d7e#diff-4b667feae66c9d46b21b9ecc19e8958cf4472d162ce0a47ac3e8386af8bbd8cfL213-R221

Traceback:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x6b18f07]
goroutine 52825 [running]:
github.com/hashicorp/vault-plugin-secrets-azure.(*client).unassignRoles(0xc00ab0eaa0, {0xc9c2b68, 0xc0089a1960}, {0xc00fb0c800?, 0x2e, 0xc00f888910?})
        /home/runner/go/pkg/mod/github.com/hashicorp/vault-plugin-secrets-azure@v0.17.0/client.go:221 +0x127
github.com/hashicorp/vault-plugin-secrets-azure.(*azureSecretBackend).rollbackRoleAssignWAL(0xc0038906c0, {0xc9c2b68, 0xc0089a1960}, 0xc0054030e0, {0x8cc67c0, 0xc00fa3be30})
        /home/runner/go/pkg/mod/github.com/hashicorp/vault-plugin-secrets-azure@v0.17.0/wal.go:157 +0x4ff
github.com/hashicorp/vault-plugin-secrets-azure.(*azureSecretBackend).walRollback(0x10f45f4?, {0xc9c2b68?, 0xc0089a1960?}, 0x3504?, {0xc00f6faad0?, 0xc00f9f9600?}, {0x8cc67c0
        /home/runner/go/pkg/mod/github.com/hashicorp/vault-plugin-secrets-azure@v0.17.0/wal.go:33 +0x94
github.com/hashicorp/vault/sdk/framework.(*Backend).handleWALRollback(0xc0062b0e10, {0xc9c2b68, 0xc0089a1960}, 0xc0054030e0)
        /home/runner/work/vault/vault/sdk/framework/backend.go:714 +0x3c3
github.com/hashicorp/vault/sdk/framework.(*Backend).handleRollback(0xc0062b0e10, {0xc9c2b68, 0xc0089a1960}, 0x270f?)
        /home/runner/work/vault/vault/sdk/framework/backend.go:657 +0xc6
github.com/hashicorp/vault/sdk/framework.(*Backend).HandleRequest(0xc0062b0e10, {0xc9c2b68, 0xc0089a1960}, 0xc0054030e0)
        /home/runner/work/vault/vault/sdk/framework/backend.go:219 +0x105
github.com/hashicorp/vault/builtin/plugin/v5.(*backend).HandleRequest(0xc0049a4280, {0xc9c2b68, 0xc0089a1960}, 0xc0054030e0)
        /home/runner/work/vault/vault/builtin/plugin/v5/backend.go:94 +0xbd
github.com/hashicorp/vault/vault.(*Router).routeCommon(0xc002ad2660, {0xc9c2b68, 0xc0089a1960}, 0xc0054030e0, 0x0)
        /home/runner/work/vault/vault/vault/router.go:803 +0x1686
github.com/hashicorp/vault/vault.(*Router).Route(...)
        /home/runner/work/vault/vault/vault/router.go:572
github.com/hashicorp/vault/vault.(*RollbackManager).attemptRollback(0xc003cbc840, {0xc9c2ac0, 0xc0039cb770}, {0xc00621ce66, 0x6}, 0xc003891080, 0x1)
        /home/runner/work/vault/vault/vault/rollback.go:341 +0x84d
github.com/hashicorp/vault/vault.(*RollbackManager).startOrLookupRollback.func1()
        /home/runner/work/vault/vault/vault/rollback.go:248 +0x35
github.com/gammazero/workerpool.worker(0x0?, 0x0?, 0x0?)
        /home/runner/go/pkg/mod/github.com/gammazero/workerpool@v1.1.3/workerpool.go:237 +0x22
created by github.com/gammazero/workerpool.(*WorkerPool).dispatch in goroutine 3248
        /home/runner/go/pkg/mod/github.com/gammazero/workerpool@v1.1.3/workerpool.go:197 +0x2b3
vault.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
vault.service: Failed with result 'exit-code'.
vault.service: Consumed 20.741s CPU time.
vault.service: Scheduled restart job, restart counter is at 6.

heatherezell commented 4 months ago

Fixed by https://github.com/hashicorp/vault-plugin-secrets-azure/pull/191

Foxboron commented 4 months ago

Closing this as there has been a 0.17.1 release with https://github.com/hashicorp/vault-plugin-secrets-azure/pull/194

We haven't tested this patch ourselves yet, but I'll close this for good measure :)

Thanks for the quick fix @vinay-gopalan and @hsimon-hashicorp

jmarion-arista commented 2 months ago

I'm still seeing this issue with 0.17.1 (built into Vault Enterprise 1.16.2):

goroutine 9184635 [running]:
[ ... ]
github.com/hashicorp/vault-plugin-secrets-azure.(*client).unassignRoles(0xc0224dceb0, {0xd56c9e8, 0xc013974380}, {0xc00bdb13a0?, 0x1, 0xc023bc4a50?})
        /home/runner/go/pkg/mod/github.com/hashicorp/vault-plugin-secrets-azure@v0.17.1/client.go:221 +0x136 fp=0xc013921110 sp=0xc013921080 pc=0x7540e16
github.com/hashicorp/vault-plugin-secrets-azure.(*azureSecretBackend).rollbackRoleAssignWAL(0xc008d8f920, {0xd56c9e8, 0xc013974380}, 0xc0121070e0, {0x958ec60, 0xc007deaed0})
        /home/runner/go/pkg/mod/github.com/hashicorp/vault-plugin-secrets-azure@v0.17.1/wal.go:157 +0x4ff fp=0xc013921238 sp=0xc013921110 pc=0x754ebdf
github.com/hashicorp/vault-plugin-secrets-azure.(*azureSecretBackend).walRollback(0x10fe414?, {0xd56c9e8?, 0xc013974380?}, 0x20e?, {0xc012d61b50?, 0xc009828f40?}, {0x958ec60?, 0xc007deaed0?})
        /home/runner/go/pkg/mod/github.com/hashicorp/vault-plugin-secrets-azure@v0.17.1/wal.go:33 +0x94 fp=0xc013921288 sp=0xc013921238 pc=0x754dd74
github.com/hashicorp/vault-plugin-secrets-azure.(*azureSecretBackend).walRollback-fm({0xd56c9e8?, 0xc013974380?}, 0x13955140?, {0xc012d61b50?, 0x30acd41c6bfd?}, {0x958ec60?, 0xc007deaed0?})
        <autogenerated>:1 +0x58 fp=0xc0139212d8 sp=0xc013921288 pc=0x7551258
github.com/hashicorp/vault/sdk/framework.(*Backend).handleWALRollback(0xc008d9c2d0, {0xd56c9e8, 0xc013974380}, 0xc0121070e0)
        /home/runner/actions-runner/_work/vault-enterprise/vault-enterprise/sdk/framework/backend.go:714 +0x3c3 fp=0xc013921410 sp=0xc0139212d8 pc=0x10f1623
[ ... ]

We've filed HashiCorp support ticket #149475 to track this too.

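I believe the cause is that the right-hand side of this && operation should be in parentheses. In Go, && binds tighter than ||, so without them the nil check guards only the first StatusCode comparison: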

if rawResponse != nil && (rawResponse.StatusCode == http.StatusNoContent || rawResponse.StatusCode == http.StatusNotFound) {
    continue
}