LINBIT / linstor-server

High Performance Software-Defined Block Storage for container, cloud and virtualisation. Fully integrated with Docker, Kubernetes, Openstack, Proxmox etc.
https://docs.linbit.com/docs/linstor-guide/
GNU General Public License v3.0

linstor resource make-available failed due to an unknown exception on shared-lun #351

Open kvaps opened 1 year ago

kvaps commented 1 year ago

Hi, I created a shared LUN and am trying to use it from Kubernetes:

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-shared-lun
parameters:
  linstor.csi.linbit.com/layerList: storage
  linstor.csi.linbit.com/storagePool: shared-lun
provisioner: linstor.csi.linbit.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: disk-vm0-boot
  namespace: default
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: linstor-shared-lun
  volumeMode: Block

However, the CSI plugin cannot attach the volume to the node because of an unknown exception:

  Warning  FailedAttachVolume  70s                    attachdetach-controller  AttachVolume.Attach failed for volume "pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6" : rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6: Message: 'Registration of resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on node hf-virt-02 failed due to an unknown exception.'; Details: 'Node: hf-virt-02, Resource: 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6''; Reports: '[642FF34D-00000-000026]'

I tried to reproduce this manually:

# linstor r mkavail hf-virt-02 pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6
ERROR:
Description:
    Registration of resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on node hf-virt-02 failed due to an unknown exception.
Details:
    Node: hf-virt-02, Resource: 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6'
Show reports:
    linstor error-reports show 642FF34D-00000-000028
command terminated with exit code 10
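
To look up the matching error report without copying the ID by hand, the ID can be pulled out of the client output. This is only a minimal sketch and not part of LINSTOR: `extract_report_id` is a hypothetical helper, and its pattern is an assumption inferred from the report IDs shown in this issue (for example 642FF34D-00000-000028).

```shell
# Hypothetical helper: read LINSTOR client output on stdin and print the
# first error-report ID found. The pattern (8 hex digits, 5 hex digits,
# 6 decimal digits) is assumed from the IDs in this issue, not from a spec.
extract_report_id() {
    grep -oE '[0-9A-F]{8}-[0-9A-F]{5}-[0-9]{6}' | head -n 1
}
```

For example, piping `linstor r mkavail hf-virt-02 pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6 2>&1` into `extract_report_id` should yield an ID that can then be passed to `linstor error-reports show`.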

However, the resource create command works fine:

# linstor r c hf-virt-02 pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6
SUCCESS:
Description:
    New resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on node 'hf-virt-02' registered.
Details:
    Resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on node 'hf-virt-02' UUID is: ccd97506-a922-4b11-bd20-3e46bff2bb1b
SUCCESS:
Description:
    Volume with number '0' on resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on node 'hf-virt-02' successfully registered
Details:
    Volume UUID is: 3fbf37db-ac4c-4931-adcf-93d62e18437e
SUCCESS:
    Added peer(s) 'hf-virt-02' to resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on 'hf-virt-01'
SUCCESS:
    Created resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on 'hf-virt-02'
INFO:
    Updated 0 resync-after entries.

Error report:

ERROR REPORT 642FF34D-00000-000028

============================================================

Application:                        LINBIT® LINSTOR
Module:                             Controller
Version:                            1.21.0
Build ID:                           b44bb8d41f264ac1089d9a0a1c540d3cc703d7e8
Build time:                         2023-04-04T10:11:03+00:00
Error time:                         2023-04-17 12:29:53
Node:                               linstor-controller-6cb66d64ff-78jbz
Peer:                               RestClient(169.254.42.1; 'PythonLinstor/1.17.0 (API1.0.4): Client 1.17.0')

============================================================

Reported error:
===============

Category:                           RuntimeException
Class name:                         NullPointerException
Class canonical name:               java.lang.NullPointerException
Generated at:                       Method 'makeRscAvailableInTransaction', Source file 'CtrlRscMakeAvailableApiCallHandler.java', Line #287

Error context:
    Registration of resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on node hf-virt-02 failed due to an unknown exception.

Asynchronous stage backtrace:

    Error has been observed at the following site(s):
        |_ checkpoint ⇢ Make resource available
    Stack trace:

Call backtrace:

    Method                                   Native Class:Line number
    makeRscAvailableInTransaction            N      com.linbit.linstor.core.apicallhandler.controller.CtrlRscMakeAvailableApiCallHandler:287

Suppressed exception 1 of 1:
===============
Category:                           RuntimeException
Class name:                         OnAssemblyException
Class canonical name:               reactor.core.publisher.FluxOnAssembly.OnAssemblyException
Generated at:                       Method 'makeRscAvailableInTransaction', Source file 'CtrlRscMakeAvailableApiCallHandler.java', Line #287

Error message:
Error has been observed at the following site(s):
    |_ checkpoint ⇢ Make resource available
Stack trace:

Error context:
    Registration of resource 'pvc-4c9f79c4-ba88-4b5a-b4d1-b15728e244a6' on node hf-virt-02 failed due to an unknown exception.

Call backtrace:

    Method                                   Native Class:Line number
    makeRscAvailableInTransaction            N      com.linbit.linstor.core.apicallhandler.controller.CtrlRscMakeAvailableApiCallHandler:287
    lambda$makeResourceAvailable$0           N      com.linbit.linstor.core.apicallhandler.controller.CtrlRscMakeAvailableApiCallHandler:152
    doInScope                                N      com.linbit.linstor.core.apicallhandler.ScopeRunner:150
    lambda$fluxInScope$0                     N      com.linbit.linstor.core.apicallhandler.ScopeRunner:76
    call                                     N      reactor.core.publisher.MonoCallable:91
    trySubscribeScalarMap                    N      reactor.core.publisher.FluxFlatMap:126
    subscribeOrReturn                        N      reactor.core.publisher.MonoFlatMapMany:49
    subscribe                                N      reactor.core.publisher.Flux:8343
    onNext                                   N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:188
    request                                  N      reactor.core.publisher.Operators$ScalarSubscription:2344
    onSubscribe                              N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:134
    subscribe                                N      reactor.core.publisher.MonoCurrentContext:35
    subscribe                                N      reactor.core.publisher.Flux:8357
    onNext                                   N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:188
    request                                  N      reactor.core.publisher.Operators$ScalarSubscription:2344
    onSubscribe                              N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:134
    subscribe                                N      reactor.core.publisher.MonoCurrentContext:35
    subscribe                                N      reactor.core.publisher.Mono:4252
    subscribeWith                            N      reactor.core.publisher.Mono:4363
    subscribe                                N      reactor.core.publisher.Mono:4223
    subscribe                                N      reactor.core.publisher.Mono:4159
    subscribe                                N      reactor.core.publisher.Mono:4131
    doFlux                                   N      com.linbit.linstor.api.rest.v1.RequestHelper:304
    makeResourceAvailable                    N      com.linbit.linstor.api.rest.v1.Resources:309
    invoke0                                  Y      jdk.internal.reflect.NativeMethodAccessorImpl:unknown
    invoke                                   N      jdk.internal.reflect.NativeMethodAccessorImpl:62
    invoke                                   N      jdk.internal.reflect.DelegatingMethodAccessorImpl:43
    invoke                                   N      java.lang.reflect.Method:566
    lambda$static$0                          N      org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory:52
    run                                      N      org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1:124
    invoke                                   N      org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher:167
    doDispatch                               N      org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker:159
    dispatch                                 N      org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher:79
    invoke                                   N      org.glassfish.jersey.server.model.ResourceMethodInvoker:469
    apply                                    N      org.glassfish.jersey.server.model.ResourceMethodInvoker:391
    apply                                    N      org.glassfish.jersey.server.model.ResourceMethodInvoker:80
    run                                      N      org.glassfish.jersey.server.ServerRuntime$1:253
    call                                     N      org.glassfish.jersey.internal.Errors$1:248
    call                                     N      org.glassfish.jersey.internal.Errors$1:244
    process                                  N      org.glassfish.jersey.internal.Errors:292
    process                                  N      org.glassfish.jersey.internal.Errors:274
    process                                  N      org.glassfish.jersey.internal.Errors:244
    runInScope                               N      org.glassfish.jersey.process.internal.RequestScope:265
    process                                  N      org.glassfish.jersey.server.ServerRuntime:232
    handle                                   N      org.glassfish.jersey.server.ApplicationHandler:680
    service                                  N      org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer:356
    run                                      N      org.glassfish.grizzly.http.server.HttpHandler$1:200
    doWork                                   N      org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker:569
    run                                      N      org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker:549
    run                                      N      java.lang.Thread:829

END OF ERROR REPORT.
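
For triage, the two most useful fields in a report like the one above are the exception class and the place it was generated. The following is a rough sketch, assuming the report text has the same field labels as shown here; `summarize_report` is a hypothetical helper, not a LINSTOR command.

```shell
# Hypothetical helper: read an error report on stdin and print the first
# "Class canonical name" and "Generated at" values. Later occurrences
# (such as suppressed exceptions) are ignored.
summarize_report() {
    awk '
        /^Class canonical name:/ && !seen_class {
            sub(/^Class canonical name:[ \t]*/, "")
            print "exception: " $0
            seen_class = 1
        }
        /^Generated at:/ && !seen_origin {
            sub(/^Generated at:[ \t]*/, "")
            print "origin: " $0
            seen_origin = 1
        }
    '
}
```

Piping the output of `linstor error-reports show 642FF34D-00000-000028` into `summarize_report` would, under these assumptions, reduce the report to the NullPointerException and its origin in `makeRscAvailableInTransaction`.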