openebs-archive / cstor-operators

Collection of OpenEBS cStor Data Engine Operators
https://openebs.io
Apache License 2.0

Unable to remove failed pool from cspc with runtime error in cstor-admission-server #446

Closed: jpflouret closed this issue 1 year ago

jpflouret commented 1 year ago

I had a node failure, and I manually deleted the CSPI (CStorPoolInstance) for the failed pool before removing the pool from the CSPC (CStorPoolCluster) spec. Now I'm unable to remove the failed pool from the CSPC. When I edit the spec and remove the pool, I get the following error:

$ kubectl -n openebs edit cspc cstor-pool-cluster
error: cstorpoolclusters.cstor.openebs.io "cstor-pool-cluster" could not be patched: Internal error occurred: failed calling webhook "admission-webhook.cstor.openebs.io": failed to call webhook: Post "https://openebs-cstor-admission-server.openebs.svc:443/validate?timeout=5s": EOF

Logs for the admission server show a runtime error (index out of range) in ValidateScaledown in cspc.go:

$ kubectl -n openebs logs openebs-cstor-admission-server-5f776fdfcc-4sbxz 
...
2023/04/24 04:38:37 http: panic serving 10.244.5.0:23642: runtime error: index out of range [0] with length 0
goroutine 39 [running]:
net/http.(*conn).serve.func1()
    /usr/local/go/src/net/http/server.go:1850 +0xbf
panic({0x1619340, 0xc0000af740})
    /usr/local/go/src/runtime/panic.go:890 +0x262
github.com/openebs/cstor-operators/pkg/webhook.(*PoolOperations).ValidateScaledown(0xc0005076c0)
    /go/src/github.com/openebs/cstor-operator/pkg/webhook/cspc.go:581 +0xb56
github.com/openebs/cstor-operators/pkg/webhook.(*webhook).validateCSPCUpdateRequest(0xc0003674d0, 0xc0002f8ea0, 0x17cd490)
    /go/src/github.com/openebs/cstor-operator/pkg/webhook/cspc.go:518 +0x533
github.com/openebs/cstor-operators/pkg/webhook.(*webhook).validateCSPC(0xc000489190?, 0x10?)
    /go/src/github.com/openebs/cstor-operator/pkg/webhook/cspc.go:121 +0x108
github.com/openebs/cstor-operators/pkg/webhook.(*webhook).validate(0xc00009ab70?, 0xc00009ac30)
    /go/src/github.com/openebs/cstor-operator/pkg/webhook/webhook.go:200 +0x22a
github.com/openebs/cstor-operators/pkg/webhook.(*webhook).Serve(0x11bfa0?, {0x191c9e8, 0xc000194460}, 0xc000182a00)
    /go/src/github.com/openebs/cstor-operator/pkg/webhook/webhook.go:244 +0x352
net/http.HandlerFunc.ServeHTTP(0xc000194460?, {0x191c9e8?, 0xc000194460?}, 0x16fd004?)
    /usr/local/go/src/net/http/server.go:2109 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc0001383ff?, {0x191c9e8, 0xc000194460}, 0xc000182a00)
    /usr/local/go/src/net/http/server.go:2487 +0x149
net/http.serverHandler.ServeHTTP({0x1910dd8?}, {0x191c9e8, 0xc000194460}, 0xc000182a00)
    /usr/local/go/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc0003feb40, {0x191d300, 0xc0003c6780})
    /usr/local/go/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:3102 +0x4db
...

Error occurs here: https://github.com/openebs/cstor-operators/blob/25db0f4b41be4f4eb3c13e16e3a456e6439b94b4/pkg/webhook/cspc.go#L581

I found this guide with instructions similar to what I was doing, except that I manually deleted the CSPI, which seems to be what triggers the problem in the code above.

I need to remove this pool instance from my cluster and I'm stuck on what to do next. Any help would be appreciated.

Thanks in advance.

jpflouret commented 1 year ago

Opened openebs/openebs#3629 as per README.md